JOIN-ACCUMULATE MACHINE: A SEMI-COHERENT SCALABLE TRUSTLESS VM
DRAFT 0.3.8
September 23, 2024
DR. GAVIN WOOD
FOUNDER, POLKADOT & ETHEREUM
GAVIN@PARITY.IO
Abstract. We present a comprehensive and formal definition of Jam, a protocol combining elements of both Polkadot and Ethereum. In a single coherent model, Jam provides a global singleton permissionless object environment—much like the smart-contract environment pioneered by Ethereum—paired with secure sideband computation parallelized over a scalable node network, a proposition pioneered by Polkadot.
Jam introduces a decentralized hybrid system offering smart-contract functionality structured around a secure and scalable in-core/on-chain dualism. While the smart-contract functionality implies some similarities with Ethereum’s paradigm, the overall model of the service offered is driven largely by the underlying architecture of Polkadot.
Jam is permissionless in nature, allowing anyone to deploy code as a service on it for a fee commensurate with the resources this code utilizes and to induce execution of this code through the procurement and allocation of core-time, a metric of resilient and ubiquitous computation, somewhat similar to the purchasing of gas in Ethereum. We already envision a Polkadot-compatible CoreChains service.
1. Introduction
1.1. Nomenclature. In this paper, we introduce a de-
centralized, crypto-economic protocol to which the Polka-
dot Network could conceivably transition itself in a major
revision. Following this eventuality (which must not be
taken for granted since Polkadot is a decentralized net-
work) this protocol might also become known as Polkadot
or some derivation thereof. However, at this stage this is
not the case; therefore our proposed protocol will for the present be known as Jam.
An early, unrefined, version of this protocol was
first proposed in Polkadot Fellowship rfc31, known
as CoreJam. CoreJam takes its name after the col-
lect/refine/join/accumulate model of computation at the
heart of its service proposition. While the CoreJam rfc suggested an incomplete, scope-limited alteration to the Polkadot protocol, Jam refers to a complete and coherent overall blockchain protocol.
1.2. Driving Factors. Within the realm of blockchain
and the wider Web3, we are driven by the need first and
foremost to deliver resilience. A proper Web3 digital sys-
tem should honor a declared service profile—and ideally
meet even perceived expectations—regardless of the de-
sires, wealth or power of any economic actors including in-
dividuals, organizations and, indeed, other Web3 systems.
Inevitably this is aspirational, and we must be pragmatic
over how perfectly this may really be delivered. Nonethe-
less, a Web3 system should aim to provide such radically
strong guarantees that, for practical purposes, the system
may be described as unstoppable.
While Bitcoin is, perhaps, the first example of such a
system within the economic domain, it was not general
purpose in terms of the nature of the service it offered. A
rules-based service is only as useful as the generality of the
rules which may be conceived and placed within it. Bit-
coin’s rules allowed for an initial use-case, namely a fixed-
issuance token, ownership of which is well-approximated
and autonomously enforced through knowledge of a secret,
as well as some further elaborations on this theme.
Later, Ethereum would provide a categorically more general-purpose rule set, one which was practically Turing complete.[1] In the context of Web3, where we are aiming to deliver a massively multiuser application platform, generality is crucial, and thus we take this as a given.
Beyond resilience and generality, things get more in-
teresting, and we must look a little deeper to understand
[1] The gas mechanism did restrict what programs can execute on it by placing an upper bound on the number of steps which may be executed, but some restriction to avoid infinite computation must surely be introduced in a permissionless setting.
what our driving factors are. For the present purposes, we identify three additional goals beyond the first two:
(1) Resilience: highly resistant to being stopped, corrupted and censored.
(2) Generality: able to perform Turing-complete computation.
(3) Performance: able to perform computation quickly and at low cost.
(4) Coherency: the causal relationship possible between different elements of state and thus how well individual applications may be composed.
(5) Accessibility: negligible barriers to innovation; easy, fast, cheap and permissionless.
As a declared Web3 technology, we make an implicit
assumption of the first two items. Interestingly, items 3
and 4 are antagonistic according to an information the-
oretic principle which we are sure must already exist in
some form but are nonetheless unaware of a name for it.
For argument’s sake we shall name it size-synchrony an-
tagonism.
1.3. Scaling under Size-Synchrony Antagonism. Size-synchrony antagonism is a simple principle implying that as the state-space of an information system grows, the system necessarily becomes less synchronous. The argument goes:
(1) The more state a system utilizes for its data-
processing, the greater the amount of space this
state must occupy.
(2) The more space used, then the greater the
mean and variance of distances between state-
components.
(3) As the mean and variance increase, then interac-
tions become slower and subsystems must manage
the possibility that distances between interdepen-
dent components of state could be materially dif-
ferent, requiring asynchrony.
This assumes perfect coherency of the system’s state.
Setting the question of overall security aside for a moment,
we can avoid this rule by applying the divide and conquer
maxim and fragmenting the state of a system, sacrificing
its coherency. We might for example create two inde-
pendent smaller-state systems rather than one large-state
system. This pattern applies a step-curve to the principle;
intra-system processing has low size and high synchrony,
inter-system processing has high size but low synchrony.
It is the principle behind meta-networks such as Polkadot,
Cosmos and the predominant vision of a scaled Ethereum
(all to be discussed in depth shortly).
The present work explores a middle-ground in the an-
tagonism, avoiding the persistent fragmentation of state-
space of the system as with existing approaches. We do
this by introducing a new model of computation which
pipelines a highly scalable element to a highly synchro-
nous element. Asynchrony is not avoided, but we do open
the possibility for a greater degree of granularity over how
it is traded against size. In particular fragmentation can
be made ephemeral rather than persistent, drawing upon
a coherent state and fragmenting it only for as long as it
takes to execute any given piece of processing on it.
Unlike with snark-based L2-blockchain techniques for
scaling, this model draws upon crypto-economic mecha-
nisms and inherits their low-cost and high-performance
profiles and averts a bias toward centralization.
1.4. Document Structure. We begin with a brief
overview of present scaling approaches in blockchain tech-
nology in section 2. In section 3 we define and clarify the
notation from which we will draw for our formalisms.
We follow with a broad overview of the protocol in section 4, outlining the major areas including the Polka Virtual Machine (pvm), the consensus protocols Safrole and Grandpa and the common clock, and building the foundations of the formalism.
We then continue with the full protocol definition split
into two parts: firstly the correct on-chain state-transition
formula helpful for all nodes wishing to validate the chain
state, and secondly, in sections 14 and 19 the honest strat-
egy for the off-chain actions of any actors who wield a
validator key.
The main body ends with a discussion of the performance characteristics of the protocol in section 20 and finally concludes in section 21.
The appendix contains various additional material im-
portant for the protocol definition including the pvm in
appendices A & B, serialization and Merklization in ap-
pendices C & D and cryptography in appendices E, G &
H. We finish with an index of terms which includes the
values of all simple constant terms used in the work in
appendix I, and close with the bibliography.
2. Previous Work and Present Trends
In the years since the initial publication of the
Ethereum YP, the field of blockchain development has
grown immensely. Other than scalability, development
has been done around underlying consensus algorithms,
smart-contract languages and machines and overall state
environments. While interesting, these latter subjects are
mostly out of scope of the present work since they generally do not impact underlying scalability.
2.1. Polkadot. In order to deliver its service, Jam co-opts much of the same game-theoretic and cryptographic machinery as Polkadot, known as Elves and described by Jeff Burdges, Cevallos, et al. 2024. However, major differences exist in the actual service offered, with Jam providing an abstraction much closer to the actual computation model generated by the validator nodes its economy incentivizes.
It was a major point of the original Polkadot pro-
posal, a scalable heterogeneous multichain, to deliver high-
performance through partition and distribution of the
workload over multiple host machines. In doing so it took
an explicit position that composability would be lowered.
Polkadot’s constituent components, parachains, are, practically speaking, highly isolated in their nature. Though a
message passing system (xcmp) exists it is asynchronous,
coarse-grained and practically limited by its reliance on a
high-level slowly evolving interaction language xcm.
As such, the composability offered by Polkadot be-
tween its constituent chains is lower than that of
Ethereum-like smart-contract systems offering a single
and universal object environment and allowing for the
kind of agile and innovative integration which underpins
their success. Polkadot, as it stands, is a collection of
independent ecosystems with only limited opportunity
for collaboration, very similar in ergonomics to bridged
blockchains though with a categorically different security
profile. A technical proposal known as spree would uti-
lize Polkadot’s unique shared-security and improve com-
posability, though blockchains would still remain isolated.
Implementing and launching a blockchain is hard, time-
consuming and costly. By its original design, Polkadot
limits the clients able to utilize its service to those who
are both able to do this and raise a sufficient deposit to
win an auction for a long-term slot, one of around 50 at
the present time. While not permissioned per se, acces-
sibility is categorically and substantially lower than for
smart-contract systems similar to Ethereum.
Enabling as many innovators to participate and inter-
act, both with each other and each other’s user-base, ap-
pears to be an important component of success for a Web3
application platform. Accessibility is therefore crucial.
2.2. Ethereum. The Ethereum protocol was formally de-
fined in this paper’s spiritual predecessor, the Yellow Pa-
per, by Wood 2014. This was derived in large part from
the initial concept paper by Buterin 2013. In the decade
since the YP was published, the de facto Ethereum proto-
col and public network instance have gone through a num-
ber of evolutions, primarily structured around introducing
flexibility via the transaction format and the instruction
set and “precompiles” (niche, sophisticated bonus instruc-
tions) of its scripting core, the Ethereum virtual machine
(evm).
Almost one million crypto-economic actors take part in the validation for Ethereum.[2] Block extension is done through a randomized leader-rotation method where the physical address of the leader is public in advance of their block production.[3] Ethereum uses Casper-FFG, introduced by Buterin and Griffith 2019, to determine finality, which with the large validator base finalizes the chain extension around every 13 minutes.
Ethereum’s direct computational performance remains
broadly similar to that with which it launched in 2015,
with a notable exception that an additional service now
allows 1mb of commitment data to be hosted per block
(all nodes to store it for a limited period). The data can-
not be directly utilized by the main state-transition func-
tion, but special functions provide proof that the data
(or some subsection thereof) is available. According to
Ethereum Foundation 2024b, the present design direction
is to improve on this over the coming years by splitting
responsibility for its storage amongst the validator base in
a protocol known as Dank-sharding.
According to Ethereum Foundation 2024a, the scaling
strategy of Ethereum would be to couple this data avail-
ability with a private market of roll-ups, sideband com-
putation facilities of various design, with zk-snark-based
roll-ups being a stated preference. Each vendor’s roll-up
design, execution and operation comes with its own impli-
cations.
One might reasonably assume that a diversified market-
based approach for scaling via multivendor roll-ups will al-
low well-designed solutions to thrive. However, there are
potential issues facing the strategy. A research report by
Sharma 2023 on the level of decentralization in the vari-
ous roll-ups found a broad pattern of centralization, but
notes that work is underway to attempt to mitigate this.
It remains to be seen how decentralized they can yet be
made.
Heterogeneous communication properties (such as
datagram latency and semantic range), security properties
(such as the costs for reversion, corruption, stalling and
censorship) and economic properties (the cost of accept-
ing and processing some incoming message or transaction)
may differ, potentially quite dramatically, between major
areas of some grand patchwork of roll-ups by various com-
peting vendors. While the overall Ethereum network may
eventually provide some or even most of the underlying
machinery needed to do the sideband computation it is
far from clear that there would be a “grand consolidation”
of the various properties should such a thing happen. We have not found any good discussion of the negative ramifications of such a fragmented approach.[4]
2.2.1. Snark Roll-ups. While the protocol’s foundation
makes no great presuppositions on the nature of roll-ups,
Ethereum’s strategy for sideband computation does cen-
tre around snark-based rollups and as such the protocol
is being evolved into a design that makes sense for this.
Snarks are the product of an area of exotic cryptography
which allow proofs to be constructed to demonstrate to a
neutral observer that the purported result of performing
some predefined computation is correct. The complexity of the verification of these proofs tends to be sub-linear in the size of the computation to be proven, and the proofs will not give away any of the internals of said computation, nor any dependent witness data on which they may rely.
Zk-snarks come with constraints. There is a trade-off
between the proof’s size, verification complexity and the
computational complexity of generating it. Non-trivial
computation, and especially the sort of general-purpose
computation laden with binary manipulation which makes
smart-contracts so appealing, is hard to fit into the model
of snarks.
To give a practical example, risc-zero (as assessed by
Bögli 2024) is a leading project and provides a platform for
producing snarks of computation done by a risc-v vir-
tual machine, an open-source and succinct risc machine
architecture well-supported by tooling. A recent bench-
marking report by PolkaVM Project 2024 showed that
compared to risc-zero’s own benchmark, proof generation
alone takes over 61,000 times as long as simply recompiling
and executing even when executing on 32 times as many
cores, using 20,000 times as much ram and an additional
state-of-the-art gpu. According to hardware rental agents
https://cloud-gpus.com/, the cost multiplier of proving
[2] Practical matters do limit the level of real decentralization. Validator software expressly provides functionality to allow a single instance to be configured with multiple key sets, systematically facilitating a much lower level of actual decentralization than the apparent number of actors, both in terms of individual operators and hardware. Using data collated by Dune and hildobby 2024 on Ethereum 2, one can see one major node operator, Lido, has steadily accounted for almost one-third of the almost one million crypto-economic participants.
[3] Ethereum’s developers hope to change this to something more secure, but no timeline is fixed.
[4] Some initial thoughts on the matter resulted in a proposal by Sadana 2024 to utilize Polkadot technology as a means of helping create a modicum of compatibility between roll-up ecosystems!
using risc-zero is 66,000,000x the cost[5] of executing using our risc-v recompiler.
Many cryptographic primitives become too expensive
to be practical to use and specialized algorithms and struc-
tures must be substituted; often these are otherwise suboptimal. In expectation of the use of snarks (such as
plonk as proposed by Gabizon, Williamson, and Ciobo-
taru 2019), the prevailing design of the Ethereum project’s
Dank-sharding availability system uses a form of erasure
coding centered around polynomial commitments over a
large prime field in order to allow snarks to get accept-
ably performant access to subsections of data. Compared
to alternatives, such as a binary field and Merklization in
the present work, it leads to a load on the validator nodes
orders of magnitude higher in terms of cpu usage.
In addition to their basic cost, snarks present no great
escape from decentralization and the need for redundancy,
leading to further cost multiples. While the need for some
benefits of staked decentralization is averted through their
verifiable nature, the need to incentivize multiple parties
to do much the same work is a requirement to ensure that
a single party not form a monopoly (or several not form
a cartel). Proving an incorrect state-transition should be
impossible, however service integrity may be compromised
in other ways; a temporary suspension of proof-generation,
even if only for minutes, could amount to major economic
ramifications for real-time financial applications.
Real-world examples exist of the pit of centralization
giving rise to monopolies. One would be the aforemen-
tioned snark-based exchange framework; while notionally
serving decentralized exchanges, it is in fact centralized
with Starkware itself wielding a monopoly over enacting
trades through the generation and submission of proofs,
leading to a single point of failure—should Starkware’s ser-
vice become compromised, then the liveness of the system
would suffer.
It has yet to be demonstrated that snark-based strate-
gies for eliminating the trust from computation will ever
be able to compete on a cost-basis with a multi-party
crypto-economic platform. All as-yet proposed snark-
based solutions are heavily reliant on crypto-economic sys-
tems to frame them and work around their issues. Data
availability and sequencing are two areas well understood
as requiring a crypto-economic solution.
We would note that snark technology is improving
and the cryptographers and engineers behind them do ex-
pect improvements in the coming years. In a recent arti-
cle by Thaler 2023 we see some credible speculation that
with some recent advancements in cryptographic tech-
niques, slowdowns for proof generation could be as lit-
tle as 50,000x from regular native execution and much
of this could be parallelized. This is substantially bet-
ter than the present situation, but still several orders of
magnitude greater than would be required to compete on
a cost-basis with established crypto-economic techniques
such as Elves.
2.3. Fragmented Meta-Networks. Directions for
general-purpose computation scalability taken by other
projects broadly centre around one of two approaches;
either what might be termed a fragmentation approach
or alternatively a centralization approach. We argue that
neither approach offers a compelling solution.
The fragmentation approach is heralded by projects
such as Cosmos (proposed by Kwon and Buchman 2019)
and Avalanche (by Tanana 2019). It involves a system
fragmented by networks of a homogeneous consensus mechanic, yet staffed by separately motivated sets of validators. This is in contrast to Polkadot’s single validator set
and Ethereum’s declared strategy of heterogeneous roll-
ups secured partially by the same validator set operating
under a coherent incentive framework. The homogeneity
of said fragmentation approach allows for reasonably con-
sistent messaging mechanics, helping to present a fairly
unified interface to the multitude of connected networks.
However, the apparent consistency is superficial. The
networks are trustless only by assuming correct operation
of their validators, who operate under a crypto-economic
security framework ultimately conjured and enforced by
economic incentives and punishments. To do twice as much work with the same levels of security and no special coordination between validator sets, such systems essentially prescribe forming a new network with the same overall levels of incentivization.
Several problems arise. Firstly, there is a simi-
lar downside as with Polkadot’s isolated parachains and
Ethereum’s isolated roll-up chains: a lack of coherency
due to a persistently sharded state preventing synchro-
nous composability.
More problematically, the scaling-by-fragmentation
approach, proposed specifically by Cosmos, provides no homogeneous security—and therefore trustlessness—
guarantees. Validator sets between networks must be
assumed to be independently selected and incentivized
with no relationship, causal or probabilistic, between the
Byzantine actions of a party on one network and potential
for appropriate repercussions on another. Essentially, this
means that should validators conspire to corrupt or revert
the state of one network, the effects may be felt across
other networks of the ecosystem.
That this is an issue is broadly accepted, and projects
propose for it to be addressed in one of two ways. Firstly,
to fix the expected cost-of-attack (and thus level of se-
curity) across networks by drawing from the same val-
idator set. The massively redundant way of doing this,
as proposed by Cosmos Project 2023 under the name
replicated security, would be to require each validator
to validate on all networks and for the same incentives
and punishments. This is economically inefficient in the
cost of security provision as each network would need to
independently provide the same level of incentives and
punishment-requirements as the most secure with which
it wanted to interoperate. This is to ensure the economic proposition remains unchanged for validators and the security proposition remains equivalent for all networks.
At the present time, replicated security is not a readily
available permissionless service. We might speculate that
these punishing economics have something to do with it.
The more efficient approach, proposed by the Om-
niLedger team, Kokoris-Kogias et al. 2017, would be to
make the validators non-redundant, partitioning them be-
tween different networks and periodically, securely and
[5] In all likelihood actually substantially more, as this was using low-tier “spare” hardware in consumer units, and our recompiler was unoptimized.
randomly repartitioning them. A reduction in the cost to attack, relative to having them all validate on a single network, is implied, since a single network may by chance receive a compromising proportion of malicious validators even when the overall proportion is below this threshold. This aside, it presents an effective means of scaling on a basis of weak coherency.
Alternatively, as in Elves by Jeff Burdges, Cevallos,
et al. 2024, we may utilize non-redundant partitioning,
combine this with a proposal-and-auditing game which
validators play to weed out and punish invalid computa-
tions, and then require that the finality of one network
be contingent on all causally-entangled networks. This
is the most secure and economically efficient solution of
the three, since there is a mechanism for being highly
confident that invalid transitions will be recognized and
corrected before their effect is finalized across the ecosys-
tem of networks. However, it requires substantially more
sophisticated logic, and the causal entanglement implies some upper limit on the number of networks which may be added.
2.4. High-Performance Fully Synchronous Net-
works. Another trend in the recent years of blockchain
development has been to make “tactical” optimizations
over data throughput by limiting the validator set size or
diversity, focusing on software optimizations, requiring a
higher degree of coherency between validators, onerous re-
quirements on the hardware which validators must have,
or limiting data availability.
The Solana blockchain is underpinned by technology
introduced by Yakovenko 2018 and boasts theoretical fig-
ures of over 700,000 transactions per second, though ac-
cording to Ng 2024 the network is only seen processing a
small fraction of this. The underlying throughput is still
substantially more than most blockchain networks and is
owed to various engineering optimizations in favor of max-
imizing synchronous performance. The result is a highly-
coherent smart-contract environment with an api not un-
like that of YP Ethereum (albeit using a different under-
lying vm), but with a near-instant time to inclusion and
finality which is taken to be immediate upon inclusion.
Two issues arise with such an approach: firstly, defin-
ing the protocol as the outcome of a heavily optimized
codebase creates structural centralization and can under-
mine resilience. Jha 2024 writes “since January 2022, 11
significant outages gave rise to 15 days in which major
or partial outages were experienced”. This is an outlier among the major blockchains, as the vast majority of them have no downtime. There are various causes of this downtime, but they are generally due to bugs found in various subsystems.
Ethereum, at least until recently, provided the most
contrasting alternative with its well-reviewed specifica-
tion, clear research over its crypto-economic foundations
and multiple clean-room implementations. It is per-
haps no surprise that the network very notably contin-
ued largely unabated when a flaw in its most deployed
implementation was found and maliciously exploited, as
described by Hertig 2016.
The second issue concerns the ultimate scalability of the protocol when it provides no means of distributing workload beyond the hardware of a single machine.
In major usage, both historical transaction data and
state would grow impractically. Solana illustrates how
much of a problem this can be. Unlike classical
blockchains, the Solana protocol offers no solution for the
archival and subsequent review of historical data, crucial
if the present state is to be proven correct from first principles by a third party. There is little information on how
Solana manages this in the literature, but according to
Solana Foundation 2023, nodes simply place the data onto a centralized database hosted by Google.[6]
Solana validators are encouraged to install large
amounts of ram to help hold its large state in mem-
ory (512 gb is the current recommendation according to
Solana Labs 2024). Without a divide-and-conquer ap-
proach, Solana shows that the level of hardware which
validators can reasonably be expected to provide dictates
the upper limit on the performance of a totally synchro-
nous, coherent execution model. Hardware requirements
represent barriers to entry for the validator set and cannot
grow without sacrificing decentralization and, ultimately,
transparency.
3. Notational Conventions
Much as in the Ethereum Yellow Paper, a number of
notational conventions are used throughout the present
work. We define them here for clarity. The Ethereum
Yellow Paper itself may be referred to henceforth as the
YP.
3.1. Typography. We use a number of different type-
faces to denote different kinds of terms. Where a term is
used to refer to a value only relevant within some localized
section of the document, we use a lower-case roman letter
e.g. x, y (typically used for an item of a set or sequence)
or e.g. i, j (typically used for numerical indices). Where
we refer to a Boolean term or a function in a local context,
we tend to use a capitalized roman alphabet letter such as
A, F . If particular emphasis is needed on the fact a term
is sophisticated or multidimensional, then we may use a
bold typeface, especially in the case of sequences and sets.
For items which retain their definition throughout the
present work, we use other typographic conventions. Sets
are usually referred to with a blackboard typeface, e.g. N
refers to all natural numbers including zero. Sets which
may be parameterized may be subscripted or be followed
by parenthesized arguments. Imported functions, used by
the present work but not specifically introduced by it, are
written in calligraphic typeface, e.g. H, the Blake2 cryptographic hashing function. For other non-context-dependent functions introduced in the present work, we use upper-case Greek letters, e.g. Υ denotes the state transition
function.
Values which are not fixed but nonetheless hold some
consistent meaning throughout the present work are de-
noted with lower case Greek letters such as σ, the state
identifier. These may be placed in bold typeface to denote
that they refer to an abnormally complex value.
[6] Earlier node versions utilized the Arweave network, a decentralized data store, but this was found to be unreliable for the data throughput which Solana required.
3.2. Functions and Operators. We define the precedes relation ≺ to indicate that one term is defined in terms of another. E.g. y ≺ x indicates that y may be defined purely in terms of x:

y ≺ x ⟺ ∃f : y = f(x)    (1)

The substitute-if-nothing function U is equivalent to the first argument which is not ∅, or ∅ if no such argument exists:

U(a_0, . . . , a_n) ≡ a_x where (a_x ≠ ∅ ∨ x = n) ∧ ∀ i < x : a_i = ∅    (2)

Thus, e.g. U(∅, 1, ∅, 2) = 1 and U(∅, ∅) = ∅.
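As a concrete illustration, the substitute-if-nothing function maps naturally onto optional values in code. A minimal sketch in Python, using `None` to stand in for ∅ (the function name and that substitution are choices of this illustration, not part of the formalism):

```python
def substitute_if_nothing(*args):
    """Return the first argument which is not None (standing in for the
    'nothing' value), or None if no such argument exists."""
    for a in args:
        if a is not None:
            return a
    return None

# Mirrors the examples in the text: U(∅, 1, ∅, 2) = 1 and U(∅, ∅) = ∅.
print(substitute_if_nothing(None, 1, None, 2))  # 1
print(substitute_if_nothing(None, None))        # None
```

Note that the check is against the distinguished nothing-value, not against falsity, matching the document's care to distinguish ∅ from Boolean false.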
3.3. Sets. Given some set s, its power set and cardinality are denoted as the usual ℘(s) and |s|. When forming a power set, we may use a numeric subscript in order to restrict the resultant expansion to a particular cardinality. E.g. ℘_2({1, 2, 3}) = {{1, 2}, {1, 3}, {2, 3}}.
Sets may be operated on with scalars, in which case the result is a set with the operation applied to each element, e.g. {1, 2, 3} + 3 = {4, 5, 6}.
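Both conventions are straightforward to mirror concretely. A brief Python sketch, using `itertools.combinations` for the cardinality-restricted power set (the helper names are illustrative only):

```python
from itertools import combinations

def power_set_of_cardinality(s, n):
    """All subsets of s with exactly n elements: the numeric-subscript
    restriction on the power set described above."""
    return {frozenset(c) for c in combinations(s, n)}

def set_plus_scalar(s, k):
    """Apply a scalar operation element-wise across a set."""
    return {x + k for x in s}

# The restriction of the power set of {1, 2, 3} to cardinality 2 yields
# {{1, 2}, {1, 3}, {2, 3}}, and {1, 2, 3} + 3 yields {4, 5, 6}.
```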
We denote set-disjointness with the relation ⫰. Formally:

A ⫰ B ⟺ A ∩ B = ∅

We commonly use ∅ to indicate that some term is validly left without a specific value. Its cardinality is defined as zero. We define the operation ? such that A? ≡ A ∪ {∅}, indicating the same set but with the addition of the ∅ element.
The term ☇ is utilized to indicate the unexpected failure of an operation or that a value is invalid or unexpected. (We try to avoid the use of the more conventional ⊥ here to avoid confusion with Boolean false, which may be interpreted as some successful result in some contexts.)
3.4. Numbers. N denotes the set of naturals including zero whereas N_n implies a restriction on that set to values less than n. Formally, N = {0, 1, . . . } and N_n = {x | x ∈ N, x < n}.
Z denotes the set of integers. We denote Z_{a...b} to be the set of integers within the interval [a, b). Formally, Z_{a...b} = {x | x ∈ Z, a ≤ x < b}. E.g. Z_{2...5} = {2, 3, 4}. We denote the offset/length form of this set as Z_{a⋅⋅⋅+b}, a short form of Z_{a...a+b}.
It can sometimes be useful to represent lengths of se-
quences and yet limit their size, especially when dealing
with sequences of octets which must be stored practically.
Typically, these lengths can be defined as the set N
2
32
.
To improve clarity, we denote N
L
as the set of lengths of
octet sequences and is equivalent to N
2
32
.
We denote the % operator as the modulo operator,
e.g. 5 % 3 = 2. Furthermore, we may occasionally express
a division result as a quotient and remainder with the
separator R , e.g. 5 ÷ 3 = 1 R 2.
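These bounded number sets and the quotient/remainder notation map directly onto familiar programming constructs; a small Python sketch:

```python
# N_n corresponds to range(n); the half-open integer interval Z_{a...b}
# corresponds to range(a, b); % and divmod give the modulo and
# quotient/remainder forms.
N_5 = set(range(5))
Z_2_5 = set(range(2, 5))

assert N_5 == {0, 1, 2, 3, 4}
assert Z_2_5 == {2, 3, 4}       # the example from the text
assert 5 % 3 == 2               # the modulo operator
assert divmod(5, 3) == (1, 2)   # 5 ÷ 3 = 1 R 2
```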
3.5. Dictionaries. A dictionary is a possibly partial mapping from some domain into some co-domain in much the same manner as a regular function. Unlike functions, however, with dictionaries the total set of pairings is necessarily enumerable, and we represent them in some data structure as the set of all (key → value) pairs. (In such data-defined mappings, it is common to name the values within the domain a key and the values within the co-domain a value, hence the naming.)

Thus, we define the formalism D⟨K → V⟩ to denote a dictionary which maps from the domain K to the range V. We define a dictionary as a member of the set of all dictionaries D, itself the set of all sets of pairs p = (k → v):

D ≡ ℘({(k → v)})   (3)

A dictionary's members must associate at most one unique value for any key k:

∀d ∈ D : ∀(k → v) ∈ d : ∃! v′ : (k → v′) ∈ d   (4)

This assertion allows us to unambiguously define the subscript and subtraction operators for a dictionary d:

∀d ∈ D : d[k] ≡ v if ∃v : (k → v) ∈ d, otherwise ∅   (5)

∀d ∈ D, s ⊆ K : d ∖ s ≡ {(k → v) : (k → v) ∈ d, k ∉ s}   (6)

Note that when using a subscript, it is an implicit assertion that the key exists in the dictionary. Should the key not exist, the result is undefined and any block which relies on it must be considered invalid.

It is typically useful to limit the sets from which the keys and values may be drawn. Formally, we define a typed dictionary D⟨K → V⟩ as a set of pairs p of the form (k → v):

D⟨K → V⟩ ⊂ D   (7)

D⟨K → V⟩ ≡ ℘({(k → v) : k ∈ K ∧ v ∈ V})   (8)

To denote the active domain (i.e. set of keys) of a dictionary d ∈ D⟨K → V⟩, we use K(d) ⊆ K, and for the range (i.e. set of values), V(d) ⊆ V. Formally:

K(d ∈ D) ≡ {k : ∃v : (k → v) ∈ d}   (9)
V(d ∈ D) ≡ {v : ∃k : (k → v) ∈ d}   (10)

Note that since the co-domain of V is a set, should different keys with equal values appear in the dictionary, the set will only contain one such value.
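The dictionary operators correspond closely to hash-map operations found in most languages; a Python sketch (the `subtract` helper here is an illustrative name for the key-set subtraction operator):

```python
# Dictionary operators sketched over Python's dict: subscription d[k],
# subtraction of a key set, and the active domain K(d) and range V(d).
d = {1: "a", 2: "b", 3: "a"}

def subtract(d, s):
    """The dictionary d without any pairs whose key is in the set s."""
    return {k: v for k, v in d.items() if k not in s}

assert d[2] == "b"                     # subscription
assert subtract(d, {2}) == {1: "a", 3: "a"}
assert set(d.keys()) == {1, 2, 3}      # K(d), the active domain
assert set(d.values()) == {"a", "b"}   # V(d): equal values collapse in a set
```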
3.6. Tuples. Tuples are groups of values where each item may belong to a different set. They are denoted with parentheses, e.g. the tuple t of the naturals 3 and 5 is denoted t = (3, 5), and it exists in the set of natural pairs sometimes denoted ℕ × ℕ, but denoted in the present work as (ℕ, ℕ).

We have frequent need to refer to a specific item within a tuple value and as such find it convenient to declare a name for each item. E.g. we may denote a tuple with two named natural components a and b as T = (a ∈ ℕ, b ∈ ℕ). We would denote an item of some tuple t ∈ T through subscripting its name, thus for some t = (a: 3, b: 5), t_a = 3 and t_b = 5.
3.7. Sequences. A sequence is a series of elements with a particular ordering not dependent on their values. The set of sequences of elements all of which are drawn from some set T is denoted ⟦T⟧, and it defines a partial mapping ℕ → T. The set of sequences containing exactly n elements, each a member of the set T, may be denoted ⟦T⟧_n and accordingly defines a complete mapping ℕ_n → T. Similarly, sets of sequences of at most n elements and at least n elements may be denoted ⟦T⟧_{≤n} and ⟦T⟧_{≥n} respectively.

Sequences are subscriptable, thus a specific item at index i within a sequence s may be denoted s[i] or, where unambiguous, s_i. A range may be denoted using an ellipsis, for example: [0, 1, 2, 3]_{...2} = [0, 1] and [0, 1, 2, 3]_{1⋅⋅⋅+2} = [1, 2]. The length of such a sequence may be denoted |s|.
We denote modulo subscription as s↺[i] ≡ s[i % |s|]. We denote the final element x of a sequence s = [..., x] through the function last(s) ≡ x.
3.7.1. Construction. We may wish to define a sequence in terms of incremental subscripts of other values: [x_0, x_1, ...]_n denotes a sequence of n values beginning with x_0 and continuing up to x_{n−1}. Furthermore, we may also wish to define a sequence as elements each of which is a function of its index i; in this case we denote [f(i) | i <− ℕ_n] ≡ [f(0), f(1), ..., f(n−1)]. Thus, when the ordering of elements matters we use <− rather than the unordered notation ∈. The latter may also be written in short form [f(i <− ℕ_n)]. This applies to any set which has an unambiguous ordering, particularly sequences, thus [i² | i <− [1, 2, 3]] = [1, 4, 9]. Multiple sequences may be combined, thus [i · j | i <− [1, 2, 3], j <− [2, 3, 4]] = [2, 6, 12].

We use the explicit notation f# to denote a function mapping over all items of a sequence. Thus, given some function y = f(x):

[f(x_0), f(x_1), ...] ≡ f#([x_0, x_1, ...])   (11)
Sequences may be constructed from sets or other sequences whose order should be ignored through sequence-ordering notation [i_k | i ∈ X], which is defined to result in the set or sequence of its argument except that all elements i are placed in ascending order of the corresponding key value i_k.
The key component may be elided, in which case it is assumed to be ordered by the elements directly; i.e. [i | i ∈ X] ≡ [i_i | i ∈ X]. A decorated variant of this notation does the same but excludes any duplicate values of i, while a further variant places the elements in descending order. E.g. assuming s = [1, 3, 2, 3], then the duplicate-excluding form gives [1, 2, 3] and the descending form gives [3, 3, 2, 1].
Sets may be constructed from sequences with the regular set-construction syntax, e.g. assuming s = [1, 2, 3, 1], then {a | a ∈ s} would be equivalent to {1, 2, 3}.

Sequences of values which themselves have a defined ordering have an implied ordering akin to a regular dictionary, thus [1, 2, 3] < [1, 2, 4] and [1, 2, 3] < [1, 2, 3, 1].
3.7.2. Editing. We define the sequence concatenation operator ⌢ such that x ⌢ y ≡ [x_0, x_1, ..., y_0, y_1, ...]. For sequences of sequences, we define a unary concatenate-all operator: ⌢(x) ≡ x_0 ⌢ x_1 ⌢ .... Further, we denote element concatenation as x ⌢ i ≡ x ⌢ [i]. We denote the sequence made up of the first n elements of sequence s to be →s_n ≡ [s_0, s_1, ..., s_{n−1}], and that made up of only the final n elements as ←s_n.

We define ᵀx as the transposition of the sequence-of-sequences x, fully defined in equation 317. We may also apply this to sequences-of-tuples to yield a tuple of sequences.

We denote sequence subtraction with a slight modification of the set subtraction operator; specifically, some sequence s excepting the left-most element equal to v would be denoted s ∖ {v}.
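These editing operations have direct Python analogues; a sketch (the `subtract_leftmost` helper name is illustrative only):

```python
# Sequence-editing operations: concatenation, concatenate-all, first/last n
# elements, and subtraction of the left-most matching element.
x, y = [1, 2], [3, 4]
assert x + y == [1, 2, 3, 4]                           # concatenation

xss = [[1], [2, 3], [4]]
assert [e for xs in xss for e in xs] == [1, 2, 3, 4]   # concatenate-all

s = [9, 8, 7, 6]
assert s[:2] == [9, 8]    # the first n elements
assert s[-2:] == [7, 6]   # the final n elements

def subtract_leftmost(s, v):
    """The sequence s excepting the left-most element equal to v."""
    t = list(s)
    t.remove(v)           # list.remove drops only the first occurrence
    return t

assert subtract_leftmost([1, 3, 2, 3], 3) == [1, 2, 3]
```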
3.7.3. Boolean values. 𝔹_s denotes the set of Boolean strings of length s, thus 𝔹_s = {⊥, ⊤}^s. When dealing with Boolean values we may assume an implicit equivalence mapping to a bit whereby ⊤ = 1 and ⊥ = 0, thus 𝔹 ≡ ℕ_2. We use the function bits(𝕐) ∈ ⟦𝔹⟧ to denote the sequence of bits, ordered with the least significant first, which represent the octet sequence 𝕐, thus bits([5, 0]) = [1, 0, 1, 0, 0, ...].
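The bits function can be sketched in a few lines of Python:

```python
# The bits function: an octet sequence expanded into bits, with the
# least-significant bit of each octet first.
def bits(octets):
    return [(o >> i) & 1 for o in octets for i in range(8)]

# 5 = 0b101, so the expansion of [5, 0] begins [1, 0, 1, 0, 0, ...]:
assert bits([5, 0]) == [1, 0, 1, 0, 0, 0, 0, 0] + [0] * 8
```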
3.7.4. Octets and Blobs. Y denotes the set of octet strings
(“blobs”) of arbitrary length. As might be expected, Y
x
denotes the set of such sequences of length x. Y
$
denotes
the subset of Y which are ASCII-encoded strings. Note
that while an octet has an implicit and obvious bijec-
tive relationship with natural numbers less than 256, and
we may implicitly coerce between octet form and natural
number form, we do not treat them as exactly equivalent
entities. In particular for the purpose of serialization, an
octet is always serialized to itself, whereas a natural num-
ber may be serialized as a sequence of potentially several
octets, depending on its magnitude and the encoding vari-
ant.
3.7.5. Shuffling. We define the sequence-shuffle function
F, originally introduced by Fisher and Yates 1938, with an
efficient in-place algorithm described by Wikipedia 2024.
This accepts a sequence and some entropy and returns a
sequence of the same length with the same elements but
in an order determined by the entropy. The entropy may
be provided as either an indefinite sequence of naturals or
a hash. For a full definition see appendix F.
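The general technique can be sketched as the classic in-place algorithm driven by an explicit entropy sequence; note this is only illustrative, and JAM's exact deterministic variant (including how a hash is expanded into entropy) is the one defined in appendix F:

```python
# An illustrative Fisher-Yates shuffle driven by an explicit entropy
# sequence (one natural per step) rather than a PRNG. A sketch only; the
# deterministic variant used by JAM is fully defined in appendix F.
def shuffle(seq, entropy):
    s = list(seq)
    for i in range(len(s) - 1, 0, -1):
        j = entropy[len(s) - 1 - i] % (i + 1)  # choose an index in 0..i
        s[i], s[j] = s[j], s[i]
    return s

out = shuffle([1, 2, 3, 4], [7, 2, 1])
assert sorted(out) == [1, 2, 3, 4]  # same elements, order set by the entropy
```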
3.8. Cryptography.
3.8.1. Hashing. ℍ denotes the set of 256-bit values typically expected to be arrived at through a cryptographic function, equivalent to 𝕐_32, with ℍ_0 being equal to [0]_32. We assume a function H(m ∈ 𝕐) ∈ ℍ denoting the Blake2b 256-bit hash introduced by Saarinen and Aumasson 2015 and a function H_K(m ∈ 𝕐) ∈ ℍ denoting the Keccak 256-bit hash as proposed by Bertoni et al. 2013 and utilized by Wood 2014.

We may sometimes wish to take only the first x octets of a hash, in which case we denote H_x(m) ∈ 𝕐_x to be the first x octets of H(m). The inputs of a hash function are generally assumed to be serialized with our codec E(x) ∈ 𝕐; however, for the purposes of clarity or unambiguity we may also explicitly denote the serialization. Similarly, we may wish to interpret a sequence of octets as some other kind of value with the assumed decoder function E⁻¹(x ∈ 𝕐). In both cases, we may subscript the transformation function with the number of octets we expect the octet-sequence term to have. Thus, r = E_4(x ∈ ℕ) would assert x ∈ ℕ_{2^32} and r ∈ 𝕐_4, whereas s = E⁻¹_8(y) would assert y ∈ 𝕐_8 and s ∈ ℕ_{2^64}.
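The Blake2b hash and its truncation are available directly in Python's standard library, which makes the notation easy to sketch (Keccak-256 has no stdlib equivalent: `hashlib.sha3_256` is the NIST SHA-3 variant, which pads differently):

```python
import hashlib

# H(m): the Blake2b 256-bit hash; H_x(m): its first x octets.
def H(m: bytes) -> bytes:
    return hashlib.blake2b(m, digest_size=32).digest()

def H_x(m: bytes, x: int) -> bytes:
    return H(m)[:x]

h = H(b"jam")
assert len(h) == 32
assert H_x(b"jam", 4) == h[:4]
```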
3.8.2. Signing Schemes. E_k⟨m⟩ ⊂ 𝕐_64 is the set of valid Ed25519 signatures, defined by Josefsson and Liusvaara 2017, made through knowledge of a secret key whose public-key counterpart is k ∈ 𝕐_32 and whose message is m. To aid readability, we denote the set of valid public keys as ℍ_E.

We use 𝕐_BLS ⊂ 𝕐_144 to denote the set of public keys for the bls signature scheme, described by Boneh, Lynn, and Shacham 2004, on curve bls12-381 defined by Hopwood et al. 2020.

We denote the set of valid Bandersnatch public keys as ℍ_B, defined in appendix G. F^{m ∈ 𝕐}_{k ∈ ℍ_B}⟨x ∈ 𝕐⟩ ⊂ 𝕐_96 is the set of valid singly-contextualized signatures utilizing the secret counterpart to the public key k, some context x and message m.

F̄^{m ∈ 𝕐}_{r ∈ 𝕐_R}⟨x ∈ 𝕐⟩ ⊂ 𝕐_784, meanwhile, is the set of valid Bandersnatch Ringvrf deterministic singly-contextualized proofs of knowledge of a secret within some set of secrets
identified by some root in the set of valid roots 𝕐_R ⊂ 𝕐_144. We denote O(s ⊆ ℍ_B) ∈ 𝕐_R to be the root specific to the set of public-key counterparts s. A root implies a specific set of Bandersnatch key pairs; knowledge of one of the secrets would imply being capable of making a unique, valid—and anonymous—proof of knowledge of a unique secret within the set.

Both the Bandersnatch signature and the Ringvrf proof strictly imply that a member utilized their secret key in combination with both the context x and the message m; the difference is that the member is identified in the former and is anonymous in the latter. Furthermore, both define a vrf output, a high-entropy hash influenced by x but not by m, formally denoted Y(F̄^m_r⟨x⟩) ∈ ℍ and Y(F^m_k⟨x⟩) ∈ ℍ.

We define the function S as the signature function, such that S_k(m) ∈ F^m_k⟨[]⟩ ∪ E_k⟨m⟩. We assert that the ability to compute a result for this function relies on knowledge of a secret key.
4. Overview

As in the Yellow Paper, we begin our formalisms by recalling that a blockchain may be defined as a pairing of some initial state together with a block-level state-transition function. The latter defines the posterior state given a pairing of some prior state and a block of data applied to it. Formally, we say:

σ′ ≡ Υ(σ, B)   (12)

where σ is the prior state, σ′ is the posterior state, B is some valid block and Υ is our block-level state-transition function.
Broadly speaking, Jam (and indeed blockchains in general) may be defined simply by specifying Υ and some genesis state σ_0.⁷ We also make several additional assumptions of agreed knowledge: a universally known clock, and the practical means of sharing data with other systems operating under the same consensus rules. The latter two were both assumptions silently made in the YP.
4.1. The Block. To aid comprehension and definition of our protocol, we partition as many of our terms as possible into their functional components. We begin with the block B, which may be restated as the header H and some input data external to the system and thus said to be extrinsic, E:

B ≡ (H, E)   (13)

E ≡ (E_T, E_D, E_P, E_A, E_G)   (14)
The header is a collection of metadata primarily concerned with cryptographic references to the blockchain ancestors and the operands and result of the present transition. As an immutable known a priori, it is assumed to be available throughout the functional components of block transition. The extrinsic data is split into its several portions:

tickets: Tickets, used for the mechanism which manages the selection of validators for the permissioning of block authoring. This component is denoted E_T.
judgements: Votes, by validators, on disputes arising between them presently taking place. This is denoted E_D.
preimages: Static data which is presently being requested to be available for workloads to be able to fetch on demand. This is denoted E_P.
availability: Assurances by each validator concerning which of the input data of workloads they have correctly received and are storing locally. This is denoted E_A.
reports: Reports of newly completed workloads whose accuracy is guaranteed by specific validators. This is denoted E_G.
4.2. The State. Our state may be logically partitioned
into several largely independent segments which can both
help avoid visual clutter within our protocol description
and provide formality over elements of computation which
may be simultaneously calculated (i.e. parallelized). We
therefore pronounce an equivalence between σ (some com-
plete state) and a tuple of partitioned segments of that
state:
σ (α, β, γ, δ, η, ι, κ, λ, ρ, τ, φ, χ, ψ, π)(15)
In summary, δ is the portion of state dealing with services, analogous in Jam to the Yellow Paper's (smart contract) accounts, the only state of the YP's Ethereum. The identities of services which hold some privileged status are tracked in χ.
Validators, who are the set of economic actors uniquely privileged to help build and maintain the Jam chain, are identified within κ, archived in λ and enqueued from ι. All other state concerning the determination of these keys is held within γ. Note this is a departure from the YP proof-of-work definitions, which were mostly stateless: that validator set was not enumerated but rather limited to those with sufficient compute power to find a partial hash-collision in the sha2-256 cryptographic hash function. An on-chain entropy pool is retained in η.
Our state also tracks two aspects of each core: α, the
authorization requirement which work done on that core
must satisfy at the time of being reported on-chain, to-
gether with the queue which fills this, φ; and ρ, each of the
cores’ currently assigned report, the availability of whose
work-package must yet be assured by a super-majority of
validators.
Finally, details of the most recent blocks and time are tracked in β and τ respectively; judgements are tracked in ψ and validator statistics in π.
4.2.1. State Transition Dependency Graph. Much as in
the YP, we specify Υ as the implication of formulating
all items of posterior state in terms of the prior state and
block. To aid the architecting of implementations which
parallelize this computation, we minimize the depth of
the dependency graph where possible. The overall depen-
dency graph is specified here:
τ′ ≺ H   (16)
β† ≺ (H, β)   (17)
β′ ≺ (H, E_G, β†, C)   (18)
7
Practically speaking, blockchains sometimes make assumptions of some fraction of participants whose behavior is simply honest, and
not provably incorrect nor otherwise economically disincentivized. While the assumption may be reasonable, it must nevertheless be stated
apart from the rules of state-transition.
γ′ ≺ (H, τ, E_T, γ, ι, η′, κ′)   (19)
η′ ≺ (H, τ, η)   (20)
κ′ ≺ (H, τ, κ, γ, ψ′)   (21)
λ′ ≺ (H, τ, λ, κ)   (22)
ψ′ ≺ (E_D, ψ)   (23)
δ† ≺ (E_P, δ, τ′)   (24)
ρ† ≺ (E_D, ρ)   (25)
ρ‡ ≺ (E_A, ρ†)   (26)
ρ′ ≺ (E_G, ρ‡, κ, τ′)   (27)
(δ′, χ′, ι′, φ′, C) ≺ (E_A, ρ†, δ†, χ, ι, φ)   (28)
α′ ≺ (H, E_G, φ′, α)   (29)
π′ ≺ (E_G, E_P, E_A, E_T, τ, κ′, π, H)   (30)
The only synchronous entanglements are visible through the intermediate components superscripted with a dagger and defined in equations 17, 24 and 26. The latter two mark a merge and join in the dependency graph and, concretely, imply that the preimage-lookup extrinsic must be folded into state before the availability extrinsic may be fully processed and accumulation of work may happen.
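The value of minimizing the depth of this graph is that components whose inputs are ready may be computed concurrently. The idea can be sketched by scheduling a simplified subset of the dependencies; the string names below are purely illustrative:

```python
# Scheduling a simplified subset of the state-transition dependency graph:
# each posterior component lists the other posterior components it needs
# (prior state and block data are implicit). graphlib yields a valid
# evaluation order; independent components may run in parallel.
from graphlib import TopologicalSorter

deps = {
    "tau'":   set(),               # depends only on H and prior state
    "beta+":  set(),
    "psi'":   set(),
    "eta'":   set(),
    "kappa'": {"psi'"},            # kappa' needs psi'
    "gamma'": {"eta'", "kappa'"},  # gamma' needs eta' and kappa'
    "delta+": {"tau'"},            # delta-dagger needs tau'
}

order = list(TopologicalSorter(deps).static_order())
assert order.index("psi'") < order.index("kappa'")
assert order.index("kappa'") < order.index("gamma'")
```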
4.3. Which History? A blockchain is a sequence of
blocks, each cryptographically referencing some prior
block by including a hash of its header, all the way back
to some first block which references the genesis header.
We already presume consensus over this genesis header H_0 and the state it represents, already defined as σ_0.
By defining a deterministic function for deriving a sin-
gle posterior state for any (valid) combination of prior
state and block, we are able to define a unique canonical
state for any given block. We generally call the block with
the most ancestors the head and its state the head state.
It is generally possible for two blocks to be valid and yet
reference the same prior block in what is known as a fork.
This implies the possibility of two different heads, each
with their own state. While we know of no way to strictly
preclude this possibility, for the system to be useful we
must nonetheless attempt to minimize it. We therefore
strive to ensure that:
(1) It be generally unlikely for two heads to form.
(2) When two heads do form they be quickly resolved
into a single head.
(3) It be possible to identify a block not much older
than the head which we can be extremely confi-
dent will form part of the blockchain’s history in
perpetuity. When a block becomes identified as
such we call it finalized and this property natu-
rally extends to all of its ancestor blocks.
These goals are achieved through a combination of
two consensus mechanisms: Safrole, which governs the
(not-necessarily forkless) extension of the blockchain; and
Grandpa, which governs the finalization of some extension
into canonical history. Thus, the former delivers point 1,
the latter delivers point 3 and both are important for de-
livering point 2. We describe these portions of the protocol
in detail in sections 6 and 19 respectively.
While Safrole limits forks to a large extent (through
cryptography, economics and common-time, below), there
may be times when we wish to intentionally fork since we
have come to know that a particular chain extension must
be reverted. In regular operation this should never hap-
pen, however we cannot discount the possibility of mali-
cious or malfunctioning nodes. We therefore define such
an extension as any which contains a block in which data
is reported which any other block’s state has tagged as
invalid (see section 10 on how this is done). We further
require that Grandpa not finalize any extension which con-
tains such a block. See section 19 for more information
here.
4.4. Time. We presume a pre-existing consensus over
time specifically for block production and import. While
this was not an assumption of Polkadot, pragmatic and
resilient solutions exist including the ntp protocol and
network. We utilize this assumption in only one way: we
require that blocks be considered temporarily invalid if
their timeslot is in the future. This is specified in detail
in section 6.
Formally, we define the time in terms of seconds passed since the beginning of the Jam Common Era, 12:00 UTC on January 1, 2024.⁸ Midday CET is selected to ensure that all significant timezones are on the same date at any exact 24-hour multiple from the beginning of the common era. Formally, this value is denoted T.
4.5. Best block. Given the recognition of a number of valid blocks, it is necessary to determine which should be treated as the "best" block, by which we mean the most recent block which we believe will ultimately be within all future Jam chains. The simplest and least risky means of
doing this would be to inspect the Grandpa finality mech-
anism which is able to provide a block for which there is a
very high degree of confidence it will remain an ancestor
to any future chain head.
However, in reducing the risk of the resulting block ul-
timately not being within the canonical chain, Grandpa
will typically return a block some small period older than
the most recently authored block. (Existing deployments
suggest around 1-2 blocks in the past under regular oper-
ation.) There are often circumstances when we may wish
to have less latency at the risk of the returned block not
ultimately forming a part of the future canonical chain.
E.g. we may be in a position of being able to author a
block, and we need to decide what its parent should be.
Alternatively, we may care to speculate about the most
recent state for the purpose of providing information to a
downstream application reliant on the state of Jam.
In these cases, we define the best block as the head of
the best chain, itself defined in section 19.
4.6. Economics. The present work describes a crypto-
economic system, i.e. one combining elements of both
cryptography and economics and game theory to deliver
a self-sovereign digital service. In order to codify and ma-
nipulate economic incentives we define a token which is
⁸ 1,704,110,400 seconds after the Unix Epoch.
native to the system, which we will simply call tokens in
the present work.
A value of tokens is generally referred to as a balance, and such a value is said to be a member of the set of balances, ℕ_B, which is exactly equivalent to the set of naturals less than 2^64 (i.e. 64-bit unsigned integers in coding parlance). Formally:

ℕ_B ≡ ℕ_{2^64}   (31)
Though unimportant for the present work, we presume that there be a standard named denomination for 10^9 tokens. This is different from Ethereum (which uses a denomination of 10^18), Polkadot (which uses a denomination of 10^10) and Polkadot's experimental cousin Kusama (which uses 10^12).
The fact that balances are constrained to being less than 2^64 implies that there may never be more than around 18×10^9 tokens (each divisible into 10^9 portions) within Jam. We would expect the total number of tokens ever issued to be substantially smaller than this.
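The bound can be checked with a line of arithmetic:

```python
# Checking the supply bound: balances are naturals below 2**64 and the
# presumed standard denomination is 10**9 portions per token, so the maximum
# conceivable number of whole tokens is a little over 18 billion.
MAX_BALANCE = 2**64 - 1
DENOMINATION = 10**9

max_tokens = MAX_BALANCE / DENOMINATION
assert 18e9 < max_tokens < 19e9   # roughly 18.4 * 10^9 tokens
```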
We further presume that a number of constant prices stated in terms of tokens are known. However, we leave the specific values to be determined in following work:

B_I: the additional minimum balance implied for a single item within a mapping.
B_L: the additional minimum balance implied for a single octet of data within a mapping.
B_S: the minimum balance implied for a service.
4.7. The Virtual Machine and Gas. In the present
work, we presume the definition of a Polka Virtual Ma-
chine (pvm). This virtual machine is based around
the risc-v instruction set architecture, specifically the
rv32em variant, and is the basis for introducing permis-
sionless logic into our state-transition function.
The pvm is comparable to the evm defined in the Yel-
low Paper, but somewhat simpler: the complex instruc-
tions for cryptographic operations are missing as are those
which deal with environmental interactions. Overall it is
far less opinionated since it alters a pre-existing general
purpose design, risc-v, and optimizes it for our needs.
This gives us excellent pre-existing tooling, since pvm re-
mains essentially compatible with risc-v, including sup-
port from the compiler toolkit llvm and languages such
as Rust and C++. Furthermore, the instruction set sim-
plicity which risc-v and pvm share, together with the
register size (32-bit), active number (13) and endianness
(little) make it especially well-suited for creating efficient
recompilers on to common hardware architectures.
The pvm is fully defined in appendix A, but for contextualization we will briefly summarize the basic invocation function Ψ, which computes the resultant state of a pvm instance initialized with some registers (⟦ℕ_R⟧_13) and ram (M) and executed for up to some amount of gas (ℕ_G), a number of approximately time-proportional computational steps:

Ψ: (𝕐, ℕ_R, ℕ_G, ⟦ℕ_R⟧_13, M) → ({∎, ☇, ∞} ∪ ({F̄, h̄} × ℕ_R), ℕ_R, ℤ_G, ⟦ℕ_R⟧_13, M)   (32)
We refer to the time-proportional computational steps as gas (much like in the YP) and limit it to a 64-bit quantity. We may use either ℕ_G or ℤ_G to bound it, the first as a prior argument since it is known to be positive, the latter as a result where a negative value indicates an attempt to execute beyond the gas limit. Within the context of the pvm, ξ ∈ ℕ_G is typically used to denote gas.

ℤ_G ≡ ℤ_{−2^63...2^63}, ℕ_G ≡ ℕ_{2^64}, ℕ_R ≡ ℕ_{2^32}   (33)
It is left as a rather important implementation detail to ensure that the time taken in computing the function Ψ(..., ξ, ...) is bounded approximately proportionally to the value of ξ, regardless of the other operands.

The pvm is a very simple risc register machine and as such has 13 registers, each of which is a 32-bit quantity, denoted as ℕ_R, a natural less than 2^32.⁹ Within the context of the pvm, ω ∈ ⟦ℕ_R⟧_13 is typically used to denote the registers.
M ≡ (V ∈ 𝕐_{2^32}, A ∈ ⟦{W, R, ∅}⟧_{2^32})   (34)

The pvm assumes a simple pageable ram of 32-bit addressable octets where each octet may be either immutable, mutable or inaccessible. The ram definition M includes two components: a value V and access A. If the component is unspecified while being subscripted, then the value component may be assumed. Within the context of the virtual machine, µ ∈ M is typically used to denote ram.

V_µ ≡ {i : µ_A[i] ≠ ∅}, V*_µ ≡ {i : µ_A[i] = W}   (35)

We define two sets of indices for the ram µ: V_µ is the set of indices which may be read from, and V*_µ is the set of indices which may be written to.
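The ram model and its derived index sets can be sketched as follows; the class and constant names here are illustrative only:

```python
# A sketch of the ram model M = (V, A): a value array plus a per-octet
# access array, from which the readable and writable index sets of
# equation 35 are derived.
NONE, READ, WRITE = 0, 1, 2   # inaccessible / immutable / mutable

class Ram:
    def __init__(self, size):
        self.value = bytearray(size)   # the value component V
        self.access = [NONE] * size    # the access component A

    def readable(self):
        """Indices which may be read from: access is not 'inaccessible'."""
        return {i for i, a in enumerate(self.access) if a != NONE}

    def writable(self):
        """Indices which may be written to: access is 'mutable'."""
        return {i for i, a in enumerate(self.access) if a == WRITE}

mu = Ram(4)
mu.access[0], mu.access[1] = READ, WRITE
assert mu.readable() == {0, 1}
assert mu.writable() == {1}
```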
Invocation of the pvm has an exit-reason as the first item in the resultant tuple. It is either:

Regular program termination caused by an explicit halt instruction, ∎.
Irregular program termination caused by some exceptional circumstance, ☇.
Exhaustion of gas, ∞.
A page fault (an attempt to access some address in ram which is not accessible), F̄. This includes the address at fault.
An attempt at progressing a host-call, h̄. This allows for the progression and integration of a context-dependent state-machine beyond the regular pvm.
The full definition follows in appendix A.
4.8. Epochs and Slots. Unlike the YP Ethereum with its proof-of-work consensus system, Jam defines a proof-of-authority consensus mechanism, with the authorized validators presumed to be identified by a set of public keys and decided by a staking mechanism residing within some system hosted by Jam. The staking system is out of scope for the present work; instead, there is an api which may be utilized to update these keys, and we presume that whatever logic is needed for the staking system will be introduced and will utilize this api as needed.
The Safrole mechanism subdivides time following gen-
esis into fixed length epochs with each epoch divided into
⁹ This is three fewer than risc-v's 16; however, program code output by compilers uses only 13, since two are reserved for operating-system use and a third is fixed as zero.
E = 600 timeslots, each of uniform length P = 6 seconds, giving an epoch period of E · P = 3600 seconds, or one hour. This six-second slot period represents the minimum time between Jam blocks, and through Safrole we aim to strictly minimize forks arising both from contention within a slot (where two valid blocks may be produced within the same six-second period) and from contention over multiple slots (where two valid blocks are produced in different timeslots but with the same parent).
Formally, when identifying a timeslot index, we use a natural less than 2^32 (in compute parlance, a 32-bit unsigned integer) indicating the number of six-second timeslots from the Jam Common Era. For use in this context we introduce the set ℕ_T:

ℕ_T ≡ ℕ_{2^32}   (36)

This implies that the lifespan of the proposed protocol takes us to mid-August of the year 2840, which, on humanity's current course, should be ample.
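The slot and epoch arithmetic above reduces to a few integer operations:

```python
# Slot and epoch arithmetic from sections 4.4 and 4.8: timeslots are
# six-second periods counted from the Jam Common Era (12:00 UTC,
# 1 January 2024; footnote 8 gives the Unix timestamp).
JAM_COMMON_ERA = 1_704_110_400   # Unix seconds
P, E = 6, 600                    # slot period (seconds), slots per epoch

def timeslot(unix_time):
    return (unix_time - JAM_COMMON_ERA) // P

def epoch(slot):
    return slot // E

assert P * E == 3600                            # one epoch per hour
assert timeslot(JAM_COMMON_ERA + 3600) == 600   # one hour in: slot 600
assert epoch(600) == 1
# A 32-bit slot index spans roughly 817 years, i.e. into the 2840s:
assert 816 < 2**32 * P / (365.25 * 86400) < 818
```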
4.9. The Core Model and Services. In the Ethereum Yellow Paper, when defining the state machine held in consensus amongst all network participants, we presume that all machines maintaining the full network state and contributing to its enlargement—or, at least, hoping to—evaluate all computation. This "everybody does everything" approach might be called the on-chain consensus model. It is unfortunately not scalable, since the network can only process as much logic in consensus as we could hope any individual node to be capable of processing itself within any given period of time.
4.9.1. In-core Consensus. In the present work, we achieve
scalability of the work done through introducing a sec-
ond model for such computation which we call the in-core
consensus model. In this model, and under normal cir-
cumstances, only a subset of the network is responsible
for actually executing any given computation and assur-
ing the availability of any input data it relies upon to
others. By doing this and assuming a certain amount of
computational parallelism within the validator nodes of
the network, we are able to scale the amount of computa-
tion done in consensus commensurate with the size of the
network, and not with the computational power of any
single machine. In the present work we expect the net-
work to be able to do upwards of 300 times the amount
of computation in-core as that which could be performed
by a single machine running the virtual machine at full
speed.
Since in-core consensus is not evaluated or verified by
all nodes on the network, we must find other ways to be-
come adequately confident that the results of the com-
putation are correct, and any data used in determining
this is available for a practical period of time. We do
this through a crypto-economic game of three stages called
guaranteeing, assuring, auditing and, potentially, judging.
Respectively, these attach a substantial economic cost to
the invalidity of some proposed computation; then a suffi-
cient degree of confidence that the inputs of the computa-
tion will be available for some period of time; and finally,
a sufficient degree of confidence that the validity of the
computation (and thus enforcement of the first guaran-
tee) will be checked by some party who we can expect to
be honest.
All execution done in-core must be reproducible by any
node synchronized to the portion of the chain which has
been finalized. Execution done in-core is therefore de-
signed to be as stateless as possible. The requirements for
doing it include only the refinement code of the service,
the code of the authorizer and any preimage lookups it
carried out during its execution.
When a work-report is presented on-chain, a specific
block known as the lookup-anchor is identified. Cor-
rect behavior requires that this must be in the finalized
chain and reasonably recent, both properties which may
be proven and thus are acceptable for use within a con-
sensus protocol.
We describe this pipeline in detail in the relevant sec-
tions later.
4.9.2. On Services and Accounts. In YP Ethereum, we have two kinds of accounts: contract accounts (whose actions are defined deterministically based on the account's associated code and state) and simple accounts which act as gateways for data to arrive into the world state and are controlled by knowledge of some secret key. In Jam, all accounts are service accounts. Like Ethereum's contract accounts, they have an associated balance, some code and state. Since they are not controlled by a secret key, they do not need a nonce.
The question then arises: how can external data be fed into the world state of Jam? And, by extension, how does overall payment happen if not by deducting the account balances of those who sign transactions? The answer to the first lies in the fact that our service definition actually includes multiple code entry-points, one concerning refinement and the other concerning accumulation. The former acts as a sort of high-performance stateless processor, able to accept arbitrary input data and distill it into some much smaller amount of output data. The latter code is more stateful, providing access to certain on-chain functionality including the possibility of transferring balance and invoking the execution of code in other services. Being stateful, this might be said to correspond more closely to the code of an Ethereum contract account.
To understand how Jam breaks up its service code is to understand Jam's fundamental proposition of generality and scalability. All data extrinsic to Jam is fed into the refinement code of some service. This code is not executed on-chain but rather is said to be executed in-core. Thus, whereas the accumulator code is subject to the same scalability constraints as Ethereum's contract accounts, refinement code is executed off-chain and subject to no such constraints, enabling Jam services to scale dramatically both in the size of their inputs and in the complexity of their computation.

While refinement and accumulation take place in consensus environments of a different nature, both are executed by the members of the same validator set. The Jam protocol, through its rewards and penalties, ensures that code executed in-core has a comparable level of crypto-economic security to that executed on-chain, leaving the primary difference between them one of scalability versus synchroneity.
As for managing payment, Jam introduces a new abstraction mechanism based around Polkadot's Agile Coretime. Within the Ethereum transactive model, the mechanism of account authorization is somewhat combined with the mechanism of purchasing blockspace, both relying on a cryptographic signature to identify a single "transactor" account. In Jam, these are separated and there is no such concept of a "transactor".

In place of Ethereum's gas model for purchasing and measuring blockspace, Jam has the concept of coretime, which is prepurchased and assigned to an authorization agent. Coretime is analogous to gas insofar as it is the underlying resource which is being consumed when utilizing Jam. Its procurement is out of scope in the present work and is expected to be managed by a system parachain operating within a parachains service, itself blessed with a number of cores for running such system services. The authorization agent allows external actors to provide input to a service without necessarily needing to identify themselves as with Ethereum's transaction signatures. They are discussed in detail in section 8.
5. The Header

We must first define the header in terms of its components. The header comprises a parent hash and prior state root (H_p and H_r), an extrinsic hash H_x, a time-slot index H_t, the epoch, winning-tickets and offenders markers H_e, H_w and H_o, a Bandersnatch block author index H_i and two Bandersnatch signatures: the entropy-yielding VRF signature H_v and a block seal H_s. Headers may be serialized to an octet sequence with and without the latter seal component using E and E_U respectively. Formally:

(37)  H ≡ (H_p, H_r, H_x, H_t, H_e, H_w, H_o, H_i, H_v, H_s)
Blocks considered invalid by the timeslot rule of equation 41 below may become valid as the wall-clock time T advances.
The blockchain is a sequence of blocks, each cryptographically referencing some prior block by including a hash derived from the parent's header, all the way back to some first block which references the genesis header. We already presume consensus over this genesis header H_0 and the state it represents, defined as σ_0.
Excepting the Genesis header, all block headers H have an associated parent header, denoted P(H), whose hash is H_p:

(38)  H_p ∈ H,  H_p ≡ H(E(P(H)))

P is thus defined as being the mapping from one block header to its parent block header. With P, we are able to define the set of ancestor headers A:

(39)  h ∈ A ⟺ h = H ∨ (∃i ∈ A : h = P(i))

We only require implementations to store headers of ancestors which were authored in the previous L = 24 hours of any block B they wish to validate.
The extrinsic hash is the hash of the block's extrinsic data. Given any block B = (H, E), then formally:

(40)  H_x ∈ H,  H_x ≡ H(E(E))
A block may only be regarded as valid once the time-slot index H_t is in the past. It is always strictly greater than that of its parent. Formally:

(41)  H_t ∈ N_T,  P(H)_t < H_t ∧ H_t · P ≤ T
The prior state root H_r is the Merkle root of the prior state σ, which by definition is also the parent block's posterior state. This is a departure from both Polkadot and the Yellow Paper's Ethereum, in both of which a block's header contains the posterior state's Merkle root. We do this to facilitate the pipelining of block computation and in particular of Merklization.

(42)  H_r ∈ H,  H_r ≡ M_σ(σ)

We assume the state-Merklization function M_σ is capable of transforming our state σ into a 32-octet commitment. See appendix D for a full definition of this function.
All blocks have an associated public key to identify the author of the block. We identify this as an index H_i into the posterior current validator set κ′. We denote the Bandersnatch key of the author as H_a, though note that this is merely an equivalence and is not serialized as part of the header.

(43)  H_i ∈ N_V,  H_a ≡ κ′[H_i]_b
5.1. The Markers. If not ∅, then the epoch marker specifies key and entropy relevant to the following epoch in case the ticket contest does not complete adequately (a very much unexpected eventuality). Similarly, the winning-tickets marker, if not ∅, provides the series of E = 600 slot sealing "tickets" for the next epoch (see the next section). Finally, the offenders marker is the sequence of Ed25519 keys of newly misbehaving validators, to be fully explained in section 10. Formally:

(44)  H_e ∈ (H, ⟦H_B⟧_V)?,  H_w ∈ ⟦C⟧_E?,  H_o ∈ ⟦H_E⟧

The terms are fully defined in sections 6.6 and 10.
6. Block Production and Chain Growth

As mentioned earlier, Jam is architected around a hybrid consensus mechanism, similar in nature to that of Polkadot's Babe/Grandpa hybrid. Jam's block production mechanism, termed Safrole after the novel Sassafras production mechanism of which it is a simplified variant, is a stateful system rather more complex than the Nakamoto consensus described in the YP.

The chief purpose of a block production consensus mechanism is to limit the rate at which new blocks may be authored and, ideally, preclude the possibility of "forks": multiple blocks with equal numbers of ancestors.

To achieve this, Safrole limits the possible author of any block within any given six-second timeslot to a single key-holder from within a prespecified set of validators. Furthermore, under normal operation, the identity of the key-holder of any future timeslot will have a very high degree of anonymity. As a side effect of its operation, we can generate a high-quality pool of entropy which may be used by other parts of the protocol and is accessible to services running on it.

Because of its tightly scoped role, the core of Safrole's state, γ, is independent of the rest of the protocol. It interacts with other portions of the protocol through ι and κ, the prospective and active sets of validator keys respectively; τ, the most recent block's timeslot; and η, the entropy accumulator.
The Safrole protocol generates, once per epoch, a se-
quence of E sealing keys, one for each potential block
within a whole epoch. Each block header includes its timeslot index H_t (the number of six-second periods since the Jam Common Era began) and a valid seal signature H_s, signed by the sealing key corresponding to the timeslot within the aforementioned sequence. Each sealing key is in fact a pseudonym for some validator which was agreed the privilege of authoring a block in the corresponding timeslot.

In order to generate this sequence of sealing keys in regular operation, and in particular to do so without making public the correspondence relation between them and the validator set, we use a novel cryptographic structure known as a Ringvrf, utilizing the Bandersnatch curve. The Bandersnatch Ringvrf allows a proof to be provided which simultaneously guarantees that the author controlled a key within a set (in our case validators) and provides an output: an unbiasable deterministic hash, giving us a secure verifiable random function (VRF). This anonymous and secure random output is a ticket, and validators' tickets with the best score define the new sealing keys, allowing the chosen validators to exercise their privilege and create a new block at the appropriate time.
6.1. Timekeeping. Here, τ defines the most recent block's slot index, which we transition to the slot index as defined in the block's header:

(45)  τ ∈ N_T,  τ′ ≡ H_t

We track the slot index in state as τ in order that we are able both to easily identify a new epoch and to determine the slot at which the prior block was authored. We denote e as the prior's epoch index and m as the prior's slot phase index within that epoch; e′ and m′ are the corresponding values for the present block:

(46)  let (e, m) = (⌊τ / E⌋, τ mod E),  (e′, m′) = (⌊τ′ / E⌋, τ′ mod E)
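The decomposition of equation 46 can be sketched directly; this is an illustrative fragment, assuming the production epoch length E = 600 slots, and the names (tau, e, m) simply follow the paper's symbols:

```python
E = 600  # slots per epoch

def epoch_and_phase(tau: int) -> tuple[int, int]:
    """Split a timeslot index tau into (epoch index e, slot phase m)."""
    return tau // E, tau % E

# Prior and present blocks in the same epoch: no epoch transition.
e, m = epoch_and_phase(1234)
e2, m2 = epoch_and_phase(1235)
assert e2 == e  # e' > e would signal an epoch change
```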
6.2. Safrole Basic State. We restate γ into a number of components:

(47)  γ ≡ (γ_k, γ_z, γ_s, γ_a)

γ_z is the epoch's root, a Bandersnatch ring root composed with the one Bandersnatch key of each of the next epoch's validators, defined in γ_k (itself defined in the next section):

(48)  γ_z ∈ Y_R

Finally, γ_a is the ticket accumulator, a series of highest-scoring ticket identifiers to be used for the next epoch. γ_s is the current epoch's slot-sealer series, which is either a full complement of E tickets or, in the case of a fallback mode, a series of E Bandersnatch keys:

(49)  γ_a ∈ ⟦C⟧_{:E},  γ_s ∈ ⟦C⟧_E ∪ ⟦H_B⟧_E

Here, C is used to denote the set of tickets, a combination of a verifiably random ticket identifier y and the ticket's entry-index r:

(50)  C ≡ (y ∈ H, r ∈ N_N)
As we state in section 6.4, Safrole requires that every block header H contain a valid seal H_s, which is a Bandersnatch signature for a public key at the appropriate index m of the current epoch's seal-key series, present in state as γ_s.
6.3. Key Rotation. In addition to the active set of validator keys κ and staging set ι, internal to the Safrole state we retain a pending set γ_k. The active set is the set of keys identifying the nodes which are currently privileged to author blocks and carry out the validation processes, whereas the pending set γ_k, which is reset to ι at the beginning of each epoch, is the set of keys which will be active in the next epoch and which determine the Bandersnatch ring root which authorizes tickets into the sealing-key contest for the next epoch.

(51)  ι ∈ ⟦K⟧_V,  γ_k ∈ ⟦K⟧_V,  κ ∈ ⟦K⟧_V,  λ ∈ ⟦K⟧_V
We must introduce K, the set of validator key tuples. This is a combination of a set of cryptographic public keys and metadata which is an opaque octet sequence, but utilized to specify practical identifiers for the validator, not least a hardware address.

The set of validator keys itself is equivalent to the set of 336-octet sequences. However, for clarity, we divide the sequence into four easily denoted components. For any validator key k, the Bandersnatch key is denoted k_b and is equivalent to the first 32 octets; the Ed25519 key, k_e, is the second 32 octets; the BLS key, denoted k_BLS, is equivalent to the following 144 octets; and finally the metadata k_m is the last 128 octets. Formally:

(52)  K ≡ Y_336
(53)  ∀k ∈ K : k_b ∈ H_B ≡ k_{0⋅⋅⋅+32}
(54)  ∀k ∈ K : k_e ∈ H_E ≡ k_{32⋅⋅⋅+32}
(55)  ∀k ∈ K : k_BLS ∈ Y_BLS ≡ k_{64⋅⋅⋅+144}
(56)  ∀k ∈ K : k_m ∈ Y_128 ≡ k_{208⋅⋅⋅+128}
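The octet-offset decomposition of equations 52 to 56 can be written out as a small sketch; the dictionary field names are illustrative, only the offsets come from the text:

```python
def split_validator_key(k: bytes) -> dict:
    """Split a 336-octet validator key into its four components."""
    assert len(k) == 336
    return {
        "bandersnatch": k[0:32],     # k_b: first 32 octets
        "ed25519":      k[32:64],    # k_e: second 32 octets
        "bls":          k[64:208],   # k_BLS: following 144 octets
        "metadata":     k[208:336],  # k_m: final 128 octets
    }

parts = split_validator_key(bytes(336))
assert len(parts["bls"]) == 144 and len(parts["metadata"]) == 128
```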
With a new epoch under regular conditions, validator keys get rotated and the epoch's Bandersnatch key root is updated into γ′_z:

(57)  (γ′_k, κ′, λ′, γ′_z) ≡ (Φ(ι), γ_k, κ, z) if e′ > e; (γ_k, κ, λ, γ_z) otherwise
      where z = O([k_b | k ∈ γ′_k])

(58)  Φ(k) ≡ [ [0, 0, …] if k_e ∈ ψ′_o; k otherwise | k ∈ k ]

Note that on epoch changes the posterior queued validator key set γ′_k is defined such that incoming keys belonging to the offenders ψ′_o are replaced with a null key containing only zeroes. The origin of the offenders is explained in section 10.
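The nullification map Φ of equation 58 can be sketched as follows, modelling keys as 336-octet strings with the Ed25519 component at octets 32..64 and the offenders set ψ′_o as a set of Ed25519 keys; this is an illustrative fragment, not a conforming implementation:

```python
NULL_KEY = bytes(336)  # the all-zero validator key

def phi(staged_keys: list[bytes], offenders: set[bytes]) -> list[bytes]:
    """Replace any staged key whose Ed25519 component is an offender
    with the null key, per equation 58."""
    return [NULL_KEY if k[32:64] in offenders else k for k in staged_keys]
```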
6.4. Sealing and Entropy Accumulation. The header must contain a valid seal and valid VRF output. These are two signatures, both using the current slot's seal key; the message data of the former is the header's serialization omitting the seal component H_s, whereas the latter is used as a bias-resistant entropy source and thus its message must already have been fixed: we use the entropy stemming from the VRF of the seal signature. Formally:

(59)  let i = γ′_s[H_t] :
      γ′_s ∈ ⟦C⟧ ⟹ i_y = Y(H_s),  H_s ∈ F_{H_a}⟨X_T ⌢ η′_3 ⌢ i_r, E_U(H)⟩,  T = 1
(60)  γ′_s ∈ ⟦H_B⟧ ⟹ i = H_a,  H_s ∈ F_{H_a}⟨X_F ⌢ η′_3, E_U(H)⟩,  T = 0

(61)  H_v ∈ F_{H_a}⟨X_E ⌢ Y(H_s), []⟩

(62)  X_E = $jam_entropy
(63)  X_F = $jam_fallback_seal
(64)  X_T = $jam_ticket_seal
Sealing using the ticket is of greater security, and we utilize this knowledge when determining a candidate block on which to extend the chain, detailed in section 19. We thus note whether the block was sealed under the regular security with the Boolean marker T. We define this only for ease of later specification.
In addition to the entropy accumulator η_0, we retain three additional historical values of the accumulator at the point of each of the three most recently ended epochs, η_1, η_2 and η_3. The second-oldest of these, η_2, is utilized to help ensure future entropy is unbiased (see equation 73) and to seed the fallback seal-key generation function with randomness (see equation 68). The oldest, η_3, is used to regenerate this randomness when verifying the seal above (see equations 59 and 60).

(65)  η ∈ ⟦H⟧_4
η_0 defines the state of the randomness accumulator, to which the provably random output of the VRF, the signature over some unbiasable input, is combined each block. η_1 and η_2 meanwhile retain the state of this accumulator at the end of the two most recently ended epochs, in order.

(66)  η′_0 ≡ H(η_0 ⌢ Y(H_v))
On an epoch transition (identified as the condition e′ > e), we therefore rotate the accumulator value into the history η_1, η_2 and η_3:

(67)  (η′_1, η′_2, η′_3) ≡ (η_0, η_1, η_2) if e′ > e; (η_1, η_2, η_3) otherwise
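The rotation rule of equation 67 is a simple shift of the history on epoch change; a direct sketch, with the function name chosen for illustration:

```python
def rotate_history(eta0, eta1, eta2, eta3, epoch_changed: bool):
    """Return the posterior history (eta'_1, eta'_2, eta'_3) per equation 67:
    on an epoch change the current accumulator value eta0 enters the history
    and the oldest value eta3 is discarded; otherwise nothing moves."""
    return (eta0, eta1, eta2) if epoch_changed else (eta1, eta2, eta3)
```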
6.5. The Slot Key Sequence. The posterior slot key sequence γ′_s is one of three expressions depending on the circumstance of the block. If the block is not the first in an epoch, then it remains unchanged from the prior γ_s. If the block signals the next epoch (by epoch index) and the previous block's slot was within the closing period of the previous epoch, then it takes the value of the prior ticket accumulator γ_a. Otherwise, it takes the value of the fallback key sequence. Formally:

(68)  γ′_s ≡ Z(γ_a) if e′ = e + 1 ∧ m ≥ Y ∧ |γ_a| = E; γ_s if e′ = e; F(η′_2, κ′) otherwise
Here, we use Z as the outside-in sequencer function, defined as follows:

(69)  Z ∈ ⟦C⟧_E → ⟦C⟧_E,  Z(s) ≡ [s_0, s_{|s|−1}, s_1, s_{|s|−2}, …]
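The outside-in re-ordering of equation 69 can be transcribed directly; a minimal sketch:

```python
def outside_in(s: list) -> list:
    """Re-order a sequence as first, last, second, second-to-last, ...
    per the sequencer Z of equation 69."""
    out = []
    lo, hi = 0, len(s) - 1
    while lo <= hi:
        out.append(s[lo])
        if lo != hi:          # avoid duplicating the middle of an odd-length input
            out.append(s[hi])
        lo, hi = lo + 1, hi - 1
    return out

assert outside_in([0, 1, 2, 3, 4, 5]) == [0, 5, 1, 4, 2, 3]
```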
Finally, F is the fallback key sequence function, which selects an epoch's worth of validator Bandersnatch keys (⟦H_B⟧_E) from the validator key set k using the entropy collected on-chain, r:

(70)  F ∈ (H, ⟦K⟧) → ⟦H_B⟧_E,  F(r, k) ≡ [ k[E₄⁻¹(H₄(r ⌢ E₄(i))) mod V]_b | i ∈ N_E ]
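The fallback selection of equation 70 can be sketched as follows. The hash and serialization functions here are stand-ins (a 4-octet blake2b digest in place of H₄, and little-endian 4-octet encoding in place of E₄); the paper defines the actual functions elsewhere, so this fragment is purely illustrative:

```python
import hashlib

def fallback_keys(entropy: bytes, keys: list[bytes], E: int = 600) -> list[bytes]:
    """For each of E slots, derive an index into the validator key set from
    the entropy and slot number, and take that key's Bandersnatch component."""
    V = len(keys)
    out = []
    for i in range(E):
        h = hashlib.blake2b(entropy + i.to_bytes(4, "little"),
                            digest_size=4).digest()       # stand-in for H_4
        idx = int.from_bytes(h, "little") % V             # bound the index
        out.append(keys[idx][0:32])                       # k_b component
    return out
```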
6.6. The Markers. The epoch and winning-tickets markers are information placed in the header in order to minimize the data transfer necessary to determine the validator keys associated with any given epoch. They are particularly useful to nodes which do not synchronize the entire state for any given block, since they facilitate the secure tracking of changes to the validator key sets using only the chain of headers.

As mentioned earlier, the header's epoch marker H_e is either empty or, if the block is the first in a new epoch, a tuple of the epoch randomness and a sequence of Bandersnatch keys defining the Bandersnatch validator keys (k_b) beginning in the next epoch. Formally:

(71)  H_e ≡ (η′_1, [k_b | k ∈ γ′_k]) if e′ > e; ∅ otherwise
The winning-tickets marker H_w is either empty or, if the block is the first after the end of the submission period for tickets and if the ticket accumulator is saturated, the final sequence of ticket identifiers. Formally:

(72)  H_w ≡ Z(γ_a) if e′ = e ∧ m < Y ≤ m′ ∧ |γ_a| = E; ∅ otherwise
6.7. The Extrinsic and Tickets. The extrinsic E_T is a sequence of proofs of valid tickets; a ticket implies an entry in our epochal "contest" to determine which validators are privileged to author a block for each timeslot in the following epoch. Tickets specify an entry index together with a proof of the ticket's validity. The proof implies a ticket identifier, a high-entropy unbiasable 32-octet sequence, which is used both as a score in the aforementioned contest and as input to the on-chain VRF.

Towards the end of the epoch (i.e. Y slots from the start) this contest is closed, implying that successive blocks within the same epoch must have an empty tickets extrinsic. At this point, the following epoch's seal key sequence becomes fixed.
We define the extrinsic as a sequence of proofs of valid tickets, each of which is a tuple of an entry index (a natural number less than N) and a proof of ticket validity. Formally:

(73)  E_T ∈ ⟦ (r ∈ N_N, p ∈ F̄_{γ_z}⟨X_T ⌢ η′_2 ⌢ r, []⟩) ⟧

(74)  |E_T| ≤ K if m′ < Y; 0 otherwise
We define n as the set of new tickets, with the ticket identifier, a hash, defined as the output component of the Bandersnatch Ringvrf proof:

(75)  n ≡ [ (y: Y(i_p), r: i_r) | i ∈ E_T ]
The tickets submitted via the extrinsic must already have been placed in order of their implied identifier. Duplicate identifiers are never allowed, lest a validator submit the same ticket multiple times:

(76)  n = [x_y | x ∈ n]   (i.e. the identifiers are sorted and duplicate-free)

(77)  {x_y | x ∈ n} ∩ {x_y | x ∈ γ_a} = ∅
The new ticket accumulator γ′_a is constructed by merging new tickets into the previous accumulator value (or the empty sequence if it is a new epoch):

(78)  γ′_a ≡ [sort_y(n ∪ x)]_{0⋅⋅⋅E}  where x = [] if e′ > e; γ_a otherwise

The maximum size of the ticket accumulator is E. On each block, the accumulator becomes the lowest items of the sorted union of tickets from the prior accumulator γ_a and the submitted tickets. It is invalid to include useless tickets in the extrinsic, so all submitted tickets must exist in their posterior ticket accumulator. Formally:

(79)  n ⊆ γ′_a
Note that it can be shown that in the case of an empty extrinsic E_T = [] (as implied by m′ ≥ Y) and an unchanged epoch (e′ = e), then γ′_a = γ_a.
7. Recent History

We retain in state information on the most recent H blocks. This is used to preclude the possibility of duplicate or out-of-date work-reports from being submitted.

(80)  β ∈ ⟦ (h ∈ H, b ∈ ⟦H?⟧, s ∈ H, p ∈ ⟦H⟧_{:C}) ⟧_{:H}

For each recent block, we retain its header hash, its state root, its accumulation-result MMR and the hash of each work-report made into it, of which there may be no more than the total number of cores, C = 341.

During the accumulation stage, a value with the partial transition of this state is provided which contains the update for the newly-known state root of the parent block:

(81)  β† ≡ β except β†[|β| − 1]_s = H_r
We define an item n comprising the new block's header hash, its accumulation-result Merkle tree root and the set of work-reports made into it (for which we use the guarantees extrinsic, E_G). Note that the accumulation-result tree root r is derived from C (defined in section 12) using the basic binary Merklization function M_B (defined in appendix E) and appending it using the MMR append function A (defined in appendix E.2) to form a Merkle mountain range.

(82)  let r = M_B([E₄(s) ⌢ E(h) | (s, h) ∈ C], H_K)
      let b = A(last([[]] ⌢ [x_b | x ∈ β]), r, H_K)
      let n = (p: [((g_w)_s)_h | g ∈ E_G], h: H(H), b, s: H⁰)

The state-trie root is placed as the zero hash, H⁰, which, while inaccurate for the end-state of the block, is nevertheless safe since β′ is not utilized except to define the next block's β†, which contains a corrected value for this.

The final state transition is then:

(83)  β′ ≡ ←_H (β† ⌢ n)  (i.e. β† with n appended, retaining only the most recent H items)
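The recent-history bookkeeping of equations 81 and 83 amounts to correcting the parent record's state root and appending a bounded ring of records. A sketch, with records as plain dicts and an assumed retained-history length `H_RECENT` standing in for the paper's constant H:

```python
H_RECENT = 8  # assumed stand-in for the protocol's H constant

def update_recent(beta: list[dict], parent_state_root: bytes,
                  new_item: dict) -> list[dict]:
    """Apply equations 81 and 83: fix the last record's state root 's' to the
    now-known parent state root, append the new item (whose own 's' is the
    zero hash), and keep only the most recent H entries."""
    beta = [dict(b) for b in beta]         # copy; do not mutate prior state
    if beta:
        beta[-1]["s"] = parent_state_root  # equation 81
    beta.append(new_item)
    return beta[-H_RECENT:]                # equation 83
```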
8. Authorization

We have previously discussed the model of work-packages and services in section 4.9; however, we have yet to make a substantial discussion of exactly how some coretime resource may be apportioned to some work-package and its associated service. In the YP Ethereum model, the underlying resource, gas, is procured at the point of introduction on-chain and the purchaser is always the same agent as the one who authors the data which describes the work to be done (i.e. the transaction). Conversely, in Polkadot the underlying resource, a parachain slot, is procured with a substantial deposit for typically 24 months at a time, and the procurer, generally a parachain team, will often have no direct relation to the author of the work to be done (i.e. a parachain block).

On a principle of flexibility, we would wish Jam capable of supporting a range of interaction patterns, both Ethereum-style and Polkadot-style. In an effort to do so, we introduce the authorization system, a means of disentangling the intention of usage for some coretime from the specification and submission of a particular workload to be executed on it. By disassociating the purchase and assignment of coretime from the specific determination of the work to be done with it, we are able to support both Ethereum-style and Polkadot-style interaction patterns.
8.1. Authorizers and Authorizations. The authoriza-
tion system involves two key concepts: authorizers and au-
thorizations. An authorization is simply a piece of opaque
data to be included with a work-package. An authorizer, meanwhile, is a piece of pre-parameterized logic which accepts as an additional parameter an authorization and, when executed within a vm of prespecified computational limits, provides a Boolean output denoting the veracity of said authorization.
Authorizers are identified as the hash of their logic
(specified as the vm code) and their pre-parameterization.
The process by which work-packages are determined to be
authorized (or not) is not the competence of on-chain logic
and happens entirely in-core and as such is discussed in
section 14.3. However, on-chain logic must identify each
set of authorizers assigned to each core in order to ver-
ify that a work-package is legitimately able to utilize that
resource. It is this subsystem we will now define.
8.2. Pool and Queue. We define the set of authorizers allowable for a particular core c as the authorizer pool α[c]. To maintain this value, a further portion of state is tracked for each core: the core's current authorizer queue φ[c], from which we draw values to fill the pool. Formally:

(84)  α ∈ ⟦⟦H⟧_{:O}⟧_C,  φ ∈ ⟦⟦H⟧_Q⟧_C

Note: the portion of state φ may be altered only through an exogenous call made from the accumulate logic of an appropriately privileged service.

The state transition of a block involves placing a new authorization into the pool from the queue:

(85)  ∀c ∈ N_C : α′[c] ≡ ←_O (F(c) ⌢ φ′[c][H_t mod Q])  (retaining at most the newest O entries)

(86)  F(c) ≡ α[c] ∖ {(g_w)_a} if ∃g ∈ E_G : (g_w)_c = c; α[c] otherwise

Since α′ is dependent on φ′, practically speaking this step must be computed after accumulation, the stage in which φ′ is defined. Note that we utilize the guarantees extrinsic E_G to remove the oldest authorizer which has been used to justify a guaranteed work-package in the current block. This is further defined in equation 136.
9. Service Accounts

As we have already noted, a service in Jam is somewhat analogous to a smart contract in Ethereum, in that it includes, amongst other items, a code component, a storage component and a balance. Unlike Ethereum, the code is split over two isolated entry-points, each with its own environmental conditions: one, refinement, is essentially stateless and happens in-core; the other, accumulation, is stateful and happens on-chain. It is the latter with which we will concern ourselves now.

Service accounts are held in state under δ, a partial mapping from a service identifier N_S into a tuple of named elements which specify the attributes of the service relevant to the Jam protocol. Formally:

(87)  N_S ≡ N_{2^32}
(88)  δ ∈ D⟨N_S → A⟩
The service account is defined as the tuple of storage dictionary s, preimage lookup dictionaries p and l, code hash c and balance b, as well as the two code gas limits g and m. Formally:

(89)  A ≡ ( s ∈ D⟨H → Y⟩, p ∈ D⟨H → Y⟩, l ∈ D⟨(H, N_L) → ⟦N_T⟧_{:3}⟩, c ∈ H, b ∈ N_B, g ∈ N_G, m ∈ N_G )

Thus, the balance of the service of index s would be denoted δ[s]_b, and the storage item of key k ∈ H for that service is written δ[s]_s[k].
9.1. Code and Gas. The code c of a service account is represented by a hash which, if the service is to be functional, must be present within its preimage lookup (see section 9.2). We thus define the actual code, c:

(90)  ∀a ∈ A : a_c ≡ a_p[a_c] if a_c ∈ K(a_p); ∅ otherwise

There are three entry-points in the code:

0 refine: Refinement, executed in-core and stateless.^10
1 accumulate: Accumulation, executed on-chain and stateful.
2 on_transfer: Transfer handler, executed on-chain and stateful.

Whereas the first, executing in-core, is described in more detail in section 14.3, the latter two are defined in the present section.

As stated in appendix A, execution time in the Jam virtual machine is measured deterministically in units of gas, represented as a natural number less than 2^64 and formally denoted N_G. We may also use Z_G to denote the set Z_{−2^63 ⋅⋅⋅ 2^63} if the quantity may be negative. There are two limits specified in the account: g, the minimum gas required in order to execute the Accumulate entry-point of the service's code, and m, the minimum required for the On Transfer entry-point.
9.2. Preimage Lookups. In addition to storing data in arbitrary key/value pairs available only on-chain, an account may also solicit data to be made available in-core, and thus available to the Refine logic of the service's code. State concerning this facility is held under the service's p and l components.

There are several differences between preimage-lookups and storage. Firstly, preimage-lookups act as a mapping from a hash to its preimage, whereas general storage maps arbitrary keys to values. Secondly, preimage data is supplied extrinsically, whereas storage data originates as part of the service's accumulation. Thirdly, preimage data, once supplied, may not be removed freely; instead it goes through a process of being marked as unavailable, and only after a period of time may it be removed from state. This ensures that historical information on its existence is retained. The final point is especially important since preimage data is designed to be queried in-core, under the Refine logic of the service's code, and thus it is important that the historical availability of the preimage is known.
We begin by reformulating the portion of state concerning our data-lookup system. The purpose of this system is to provide a means of storing static data on-chain such that it may later be made available within the execution of any service code as a function accepting only the hash of the data and its length in octets.

During the on-chain execution of the Accumulate function, this is trivial to achieve since there is inherently a state which all validators verifying the block necessarily have complete knowledge of, i.e. σ. However, for the in-core execution of Refine, there is no such state inherently available to all validators; we thus name a historical state, the lookup anchor, which must be considered recently finalized before the work result may be accumulated, hence providing this guarantee.

By retaining historical information on its availability, we become confident that any validator with a recently finalized view of the chain is able to determine whether any given preimage was available at any time within the period where auditing may occur. This ensures confidence that judgements will be deterministic even without consensus on chain state.

Restated, we must be able to define some historical lookup function Λ which determines whether the preimage of some hash h was available for lookup by some service account a at some timeslot t, and if so, provide its preimage:

(91)  Λ ∈ (A, N_T, H) → Y?,  Λ(a, t, H(p)) ∈ {p, ∅}

This function is defined shortly below in equation 93. The preimage lookup for some service of index s is denoted δ[s]_p and is a dictionary mapping a hash to its corresponding preimage. Additionally, there is metadata associated with the lookup, denoted δ[s]_l, which is a dictionary mapping some hash and presupposed length into historical information.
9.2.1. Invariants. The state of the lookup system naturally satisfies a number of invariants. Firstly, any preimage value must correspond to its hash (equation 92). Secondly, a preimage value being in state implies that its hash and length pair has some associated status, also in equation 92. Formally:

(92)  ∀a ∈ A, ∀(h ↦ p) ∈ a_p : h = H(p) ∧ (h, |p|) ∈ K(a_l)

^10 Technically there is some small assumption of state, namely the availability of some modestly recent instance of each service's preimages. The specifics of this are discussed in section 14.3.
9.2.2. Semantics. The historical status component h ∈ ⟦N_T⟧_{:3} is a sequence of up to three time slots and the cardinality of this sequence implies one of four modes:

h = []: The preimage is requested, but has not yet been supplied.
h ∈ ⟦N_T⟧_1: The preimage is available and has been since time h_0.
h ∈ ⟦N_T⟧_2: The previously available preimage is now unavailable since time h_1. It had been available from time h_0.
h ∈ ⟦N_T⟧_3: The preimage is available and has been since time h_2. It had previously been available from time h_0 until time h_1.
The historical lookup function Λ may now be defined as:

(93)  Λ ∈ (A, N_T, H) → Y?
      Λ(a, t, h) ≡ a_p[h] if h ∈ K(a_p) ∧ I(a_l[(h, |a_p[h]|)], t); ∅ otherwise
      where I(l, t) ⟺ false if [] = l;  x ≤ t if [x] = l;  x ≤ t < y if [x, y] = l;  x ≤ t < y ∨ z ≤ t if [x, y, z] = l
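The availability predicate I of equation 93 transcribes directly into code; a minimal sketch, with the historical-status sequence l given as a list of up to three timeslots:

```python
def was_available(l: list[int], t: int) -> bool:
    """Availability predicate I of equation 93: was the preimage with
    historical status l available for lookup at timeslot t?"""
    if len(l) == 0:
        return False                       # requested but never supplied
    if len(l) == 1:
        return l[0] <= t                   # available since l[0]
    if len(l) == 2:
        return l[0] <= t < l[1]            # available from l[0] until l[1]
    return l[0] <= t < l[1] or l[2] <= t   # two availability windows
```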
9.3. Account Footprint and Threshold Balance. We define the dependent values i and l as the storage footprint of the service, specifically the number of items in storage and the total number of octets used in storage. They are defined purely in terms of the storage map of a service, and it must be assumed that whenever a service's storage is changed, these change also.

Furthermore, as we will see in the account serialization function in section C, these are expected to be found explicitly within the Merklized state data. Because of this we make their set explicit.

We may then define a second dependent term t, the minimum, or threshold, balance needed for any given service account in terms of its storage footprint.

(94)  ∀a ∈ V(δ):
      a_i ∈ N_{2^32} ≡ 2 · |a_l| + |a_s|
      a_l ∈ N_{2^64} ≡ Σ_{(h,z) ∈ K(a_l)} (81 + z) + Σ_{x ∈ V(a_s)} (32 + |x|)
      a_t ∈ N_B ≡ B_S + B_I · a_i + B_L · a_l
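The footprint and threshold computation of equation 94 can be sketched as follows. The prices B_S, B_I and B_L below are placeholder values for illustration, not the protocol's constants:

```python
B_S, B_I, B_L = 100, 10, 1  # assumed placeholder base/item/octet prices

def threshold(storage: dict[bytes, bytes],
              lookups: dict[tuple[bytes, int], list[int]]) -> int:
    """Compute the threshold balance a_t of equation 94 from a service's
    storage map and its preimage-lookup metadata (keyed by (hash, length))."""
    items = 2 * len(lookups) + len(storage)                  # a_i
    octets = sum(81 + z for (_, z) in lookups) \
           + sum(32 + len(v) for v in storage.values())      # a_l
    return B_S + B_I * items + B_L * octets                  # a_t
```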
9.4. Service Privileges. Up to three services may be recognized as privileged. The portion of state in which this is held is denoted χ and has three service index components together with a gas limit. The first, χ_m, is the index of the manager service, which is the service able to effect an alteration of χ from block to block. The following two, χ_a and χ_v, are each the indices of services able to alter φ and ι from block to block.
Finally, χ_g is a small dictionary containing the indices of services which automatically accumulate in each block together with a basic amount of gas with which each accumulates. Formally:
(95) χ ≡ (χ_m ∈ N_S, χ_a ∈ N_S, χ_v ∈ N_S, χ_g ∈ D⟨N_S → N_G⟩)
10. Disputes, Verdicts and Judgements
Jam provides a means of recording judgements: consequential votes amongst most of the validators over the validity of a work-report (a unit of work done within Jam, see section 11). Such collections of judgements are known as verdicts. Jam also provides a means of registering offenses: judgements and guarantees which dissent with an established verdict. Together these form the disputes system.
The registration of a verdict is not expected to happen
very often in practice, however it is an important security
backstop for removing and banning invalid work-reports
from the processing pipeline as well as removing trouble-
some keys from the validator set where there is consen-
sus over their malfunction. It also helps coordinate nodes
to revert chain-extensions containing invalid work-reports
and provides a convenient means of aggregating all offend-
ing validators for punishment in a higher-level system.
Judgement statements come about naturally as part
of the auditing process and are expected to be positive,
further affirming the guarantors’ assertion that the work-
report is valid. In the event of a negative judgement, then
all validators audit said work-report and we assume a ver-
dict will be reached. Auditing and guaranteeing are off-
chain processes properly described in sections 14 and 17.
A judgement against a report implies that the chain is
already reverted to some point prior to the accumulation
of said report, usually forking at the block immediately
prior to that at which accumulation happened. The spe-
cific strategy for chain selection is described fully in section
19. Authoring a block with a non-positive verdict has the
effect of cancelling its imminent accumulation, as can be
seen in equation 110.
Registering a verdict also has the effect of placing a permanent record of the event on-chain and allowing any offending keys to be placed on-chain, either immediately or in forthcoming blocks, again for permanent record.
Having a persistent on-chain record of misbehavior is
helpful in a number of ways. It provides a very simple
means of recognizing the circumstances under which ac-
tion against a validator must be taken by any higher-level
validator-selection logic. Should Jam be used for a public network such as Polkadot, this would imply the slashing of the offending validator's stake on the staking parachain.
As mentioned, recording reports found to have a high
confidence of invalidity is important to ensure that said
reports are not allowed to be resubmitted. Conversely,
recording reports found to be valid ensures that additional
disputes cannot be raised in the future of the chain.
10.1. The State. The disputes state includes four items, three of which concern verdicts: a good-set (ψ_g), a bad-set (ψ_b) and a wonky-set (ψ_w) containing the hashes of all work-reports which were respectively judged to be correct, incorrect or that it appears impossible to judge. The fourth item, the punish-set (ψ_o), is a set of Ed25519 keys representing validators which were found to have misjudged a work-report.
(96) ψ ≡ (ψ_g, ψ_b, ψ_w, ψ_o)
10.2. Extrinsic. The disputes extrinsic, E_D, may contain one or more verdicts v as a compilation of judgements coming from exactly two-thirds plus one of either the active validator set or the previous epoch's validator set, i.e. the Ed25519 keys of κ or λ. Additionally, it may contain proofs of the misbehavior of one or more validators, either by guaranteeing a work-report found to be invalid (culprits, c), or by signing a judgement found to be in contradiction to a work-report's validity (faults, f). Both are considered a kind of offense. Formally:
(97)
E_D ≡ (v, c, f)
where v ∈ ⟦(H, ⌊τ/E⌋ − N_2, ⟦({⊤, ⊥}, N_V, E)⟧_{⌊2/3·V⌋+1})⟧
and c ∈ ⟦(H, H_E, E)⟧, f ∈ ⟦(H, {⊤, ⊥}, H_E, E)⟧
The signatures of all judgements must be valid in terms of one of the two allowed validator key-sets, identified by the verdict's second term, which must be either the epoch index of the prior state or one less. Formally:
(98)
∀(r, a, j) ∈ v, ∀(v, i, s) ∈ j : s ∈ E_{k[i]_e}⟨X_v ⌢ r⟩ where k = κ if a = ⌊τ/E⌋; λ otherwise
(99) X_⊤ ≡ $jam_valid, X_⊥ ≡ $jam_invalid
Offender signatures must be similarly valid and reference work-reports with judgements, and may not report keys which are already in the punish-set:
(100) ∀(r, k, s) ∈ c : r ∈ ψ′_b, k ∈ k, s ∈ E_k⟨X_G ⌢ r⟩
(101) ∀(r, v, k, s) ∈ f : r ∈ ψ′_b ⇔ r ∉ ψ′_g ⇔ v, k ∈ k, s ∈ E_k⟨X_v ⌢ r⟩
where k = {k_e | k ∈ λ ∪ κ} ∖ ψ_o
Verdicts v must be ordered by report hash. Offender signatures c and f must each be ordered by the validator's Ed25519 key. There may be no duplicate report hashes within the extrinsic, nor amongst any past reported hashes. Formally:
(102) v = [r | (r, a, j) ∈ v]
(103) c = [k | (r, k, s) ∈ c], f = [k | (r, v, k, s) ∈ f]
(104) {r | (r, a, j) ∈ v} ∩ (ψ_g ∪ ψ_b ∪ ψ_w) = ∅
The judgements of all verdicts must be ordered by validator index and there may be no duplicates:
(105) ∀(r, a, j) ∈ v : j = [i | (v, i, s) ∈ j]
We define V to derive from the sequence of verdicts introduced in the block's extrinsic, containing only the report hash and the sum of positive judgements. We require this total to be either exactly two-thirds-plus-one, zero or one-third of the validator set, indicating, respectively, that the report is good, that it's bad, or that it's wonky.¹¹ Formally:
(106) V ∈ ⟦(H, {0, ⌊1/3·V⌋, ⌊2/3·V⌋ + 1})⟧
(107) V = [(r, Σ_{(v,i,s)∈j} v) | (r, a, j) <− v]
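A sketch of this tally, assuming judgements are modelled as (vote, validator index, signature) triples and signature checking is out of scope:

```python
V = 1023                               # validator count
ALLOWED = {0, V // 3, 2 * V // 3 + 1}  # bad, wonky, good tallies

def tally_verdicts(verdicts):
    """Derive V of equation 107: each verdict (report_hash, epoch_age,
    judgements) becomes (report_hash, count of positive judgements),
    rejecting any tally outside the three admissible values."""
    out = []
    for r, _age, judgements in verdicts:
        positive = sum(1 for (vote, _i, _s) in judgements if vote)
        if positive not in ALLOWED:
            raise ValueError("tally is neither good, bad nor wonky")
        out.append((r, positive))
    return out
```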
There are some constraints placed on the composition of this extrinsic: any verdict containing solely valid judgements implies the same report having at least one valid entry in the faults sequence f. Any verdict containing solely invalid judgements implies the same report having at least two valid entries in the culprits sequence c. Formally:
(108) ∀(r, ⌊2/3·V⌋ + 1) ∈ V : ∃(r, ...) ∈ f
(109) ∀(r, 0) ∈ V : |{(r, ...) ∈ c}| ≥ 2
We clear any work-reports which we judged as uncertain or invalid from their core:
(110) ∀c ∈ N_C : ρ†[c] ≡ ∅ if ∃(H(ρ[c]_w), t) ∈ V : t < ⌊2/3·V⌋; ρ[c] otherwise
The state's good-set, bad-set and wonky-set assimilate the hashes of the reports from each verdict. Finally, the punish-set accumulates the keys of any validators who have been found guilty of offending. Formally:
(111) ψ′_g ≡ ψ_g ∪ {r | (r, ⌊2/3·V⌋ + 1) ∈ V}
(112) ψ′_b ≡ ψ_b ∪ {r | (r, 0) ∈ V}
(113) ψ′_w ≡ ψ_w ∪ {r | (r, ⌊1/3·V⌋) ∈ V}
(114) ψ′_o ≡ ψ_o ∪ {k | (r, k, s) ∈ c} ∪ {k | (r, v, k, s) ∈ f}
10.3. Header. The offenders marker must contain exactly the keys of all new offenders. Formally:
(115) H_o ≡ [k | (r, k, s) ∈ c] ⌢ [k | (r, v, k, s) ∈ f]
11. Reporting and Assurance
Reporting and assurance are the two on-chain processes by which the results of in-core computation make their way into the service state singleton, δ. A work-package, which comprises several work items, is transformed by validators acting as guarantors into its corresponding work-report, which similarly comprises several work outputs, and is then presented on-chain within the guarantees extrinsic.
At this point, the work-package is erasure coded into a
multitude of segments and each segment distributed to
the associated validator who then attests to its availabil-
ity through an assurance placed on-chain. After enough
assurances the work-report is considered available, and the
work outputs transform the state of their associated ser-
vice by virtue of accumulation, covered in section 12. The
report may also be timed-out, implying it may be replaced
by another report without accumulation.
From the perspective of the work-report, therefore,
the guarantee happens first and the assurance after-
wards. However, from the perspective of a block’s state-
transition, the assurances are best processed first since
each core may only have a single work-report pending its
package becoming available at a time. Thus, we will first
cover the transition arising from processing the availability
assurances followed by the work-report guarantees. This
sequencing can be seen formally through the requirement of an intermediate state ρ‡, utilized later in equation 142.
¹¹ This requirement may seem somewhat arbitrary, but these happen to be the decision thresholds for our three possible actions and are acceptable since the security assumptions include the requirement that at least two-thirds-plus-one validators are live (Jeff Burdges, Cevallos, et al. 2024 discusses the security implications in depth).
11.1. State. The state of the reporting and availability portion of the protocol is largely contained within ρ, which tracks the work-reports which have been reported but are not yet known to be available to a super-majority of validators, together with the time at which each was reported. As mentioned earlier, only one report may be assigned to a core at any given time. Formally:
(116) ρ ∈ ⟦(w ∈ W, t ∈ N_T)?⟧_C
As usual, intermediate and posterior values (ρ†, ρ‡, ρ′) are held under the same constraints as the prior value.
11.1.1. Work Report. A work-report, of the set W, is defined as a tuple of the work-package specification s, the refinement context x, the core-index c (i.e. the core on which the work is done), as well as the authorizer hash a and output o, and finally the results of the evaluation of each of the items in the package, r, which is always at least one item and may be no more than I items. Formally:
(117) W ≡ (s ∈ S, x ∈ X, c ∈ N_C, a ∈ H, o ∈ Y, r ∈ ⟦L⟧_{1:I})
The total serialized size of a work-report may be no greater than W_R bytes:
(118) ∀w ∈ W : |E(w)| ≤ W_R
11.1.2. Refinement Context. A refinement context, denoted by the set X, describes the context of the chain at the point that the report's corresponding work-package was evaluated. It identifies two historical blocks: the anchor, header hash a, along with its associated posterior state-root s and posterior Beefy root b; and the lookup-anchor, header hash l, along with its timeslot t. Finally, it identifies the hash of an optional prerequisite work-package p. Formally:
(119) X ≡ (a ∈ H, s ∈ H, b ∈ H, l ∈ H, t ∈ N_T, p ∈ H?)
11.1.3. Availability. We define the set of availability specifications, S, as the tuple of the work-package's hash h, an auditable work bundle length l (see section 14.4.1 for more clarity on what this is), together with an erasure-root u and a segment-root e. Work-results include this availability specification in order to ensure they are able to correctly reconstruct and audit the purported ramifications of any reported work-package. Formally:
(120) S ≡ (h ∈ H, l ∈ N_L, u ∈ H, e ∈ H)
The erasure-root (u) is the root of a binary Merkle
tree which functions as a commitment to all data required
for the auditing of the report and for use by later work-
packages should they need to retrieve any data yielded. It
is thus used by assurers to verify the correctness of data
they have been sent by guarantors, and it is later verified
as correct by auditors. It is discussed fully in section 14.
The segment-root (e) is the root of a constant-depth,
left-biased and zero-hash-padded binary Merkle tree com-
mitting to the hashes of each of the exported segments
of each work-item. These are used by guarantors to ver-
ify the correctness of any reconstructed segments they are
called upon to import for evaluation of some later work-
package. It is also discussed in section 14.
11.1.4. Work Result. We finally come to define a work result, L, which is the data conduit by which services' states may be altered through the computation done within a work-package.
(121) L ≡ (s ∈ N_S, c ∈ H, l ∈ H, g ∈ N_G, o ∈ Y ∪ J)
Work results are a tuple comprising several items.
Firstly s, the index of the service whose state is to be
altered and thus whose refine code was already executed.
We include the hash of the code of the service at the time
of being reported c, which must be accurately predicted
within the work-report according to equation 151.
Next, the hash of the payload (l) within the work item which was executed in the refine stage to give this result. This has no immediate relevance, but is something provided to the accumulation logic of the service. We follow with the gas prioritization ratio g, used when determining how much gas should be allocated to the execution of this item's accumulate.
Finally, there is the output or error of the execution of the code o, which may be either an octet sequence in case it was successful, or a member of the set J if not. This latter set is defined as the set of possible errors, formally:
(122) J ≡ {∞, ☇, BAD, BIG}
The first two are special values concerning execution of the virtual machine, with ∞ denoting an out-of-gas error and ☇ denoting an unexpected program termination. Of the remaining two, the first indicates that the service's code was not available for lookup in state at the posterior state of the lookup-anchor block. The second indicates that the code was available but was beyond the maximum size allowed S.
11.2. Package Availability Assurances. We first define ρ‡, the intermediate state to be utilized next in section 11.4, as well as W, the set of available work-reports, which we will utilize later in section 12. Both require the integration of information from the assurances extrinsic E_A.
11.2.1. The Assurances Extrinsic. The assurances extrinsic is a sequence of assurance values, at most one per validator. Each assurance is a sequence of binary values (i.e. a bitstring), one per core, together with a signature and the index of the validator who is assuring. A value of 1 (or ⊤, if interpreted as a Boolean) at any given index implies that the validator assures they are contributing to its availability.¹² Formally:
(123) E_A ∈ ⟦(a ∈ H, f ∈ B_C, v ∈ N_V, s ∈ E)⟧_{:V}
The assurances must all be anchored on the parent and ordered by validator index:
(124) ∀a ∈ E_A : a_a = H_p
(125) ∀i ∈ {1 ... |E_A|} : E_A[i−1]_v < E_A[i]_v
The signature must be one whose public key is that of the validator assuring and whose message is the serialization of the parent hash H_p and the aforementioned bitstring:
(126) ∀a ∈ E_A : a_s ∈ E_{κ′[a_v]_e}⟨X_A ⌢ H(H_p, a_f)⟩
(127) X_A ≡ $jam_available
¹² This is a "soft" implication since there is no consequence on-chain if dishonestly reported. For more information on this implication see section 16.
A bit may only be set if the corresponding core has a report pending availability on it:
(128) ∀a ∈ E_A, ∀c ∈ N_C : a_f[c] ⇒ ρ†[c] ≠ ∅
11.2.2. Available Reports. A work-report is said to become available if and only if there is a clear 2/3 super-majority of validators who have marked its core as set within the block's assurance extrinsic. Formally, we define the series of available work-reports W as:
(129) W ≡ [ρ†[c]_w | c <− N_C, Σ_{a∈E_A} a_f[c] > ⌊2/3·V⌋]
This value is utilized in the definition of both δ′ and ρ‡, which we will define presently as equivalent to ρ† except for the removal of items which are now available:
(130) ∀c ∈ N_C : ρ‡[c] ≡ ∅ if ρ†[c]_w ∈ W; ρ†[c] otherwise
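The supermajority test of equations 129 and 130 can be sketched as follows, with ρ modelled as a per-core list of optional (report, timeslot) pairs and assurances as per-validator bitstrings; this ignores signature and ordering checks:

```python
V, C = 1023, 341  # validator and core counts

def process_assurances(rho, assurances):
    """Count per-core assurance bits; a report whose core is assured by
    more than 2/3 of validators becomes available and leaves rho."""
    counts = [sum(bits[c] for bits in assurances) for c in range(C)]
    super_majority = [counts[c] > 2 * V // 3 for c in range(C)]
    available = [rho[c][0] for c in range(C)
                 if rho[c] is not None and super_majority[c]]
    rho_after = [None if rho[c] is not None and super_majority[c]
                 else rho[c] for c in range(C)]
    return available, rho_after
```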
11.3. Guarantor Assignments. Every block, each core has three validators uniquely assigned to guarantee work-reports for it. This is borne out with V = 1023 validators and C = 341 cores, since V/C = 3. The core index assigned to each of the validators, as well as the validators' Ed25519 keys, are denoted by G:
(131) G ∈ (⟦N_C⟧_V, ⟦H_E⟧_V)
We determine the core to which any given validator is assigned through a shuffle using epochal entropy and a periodic rotation to help guard the security and liveness of the network. We use η_2 for the epochal entropy rather than η_1 to avoid the possibility of fork-magnification, where uncertainty about chain state at the end of an epoch could give rise to two established forks before it naturally resolves.
We define the permute function P, the rotation function R and finally the guarantor assignments G as follows:
(132) R(c, n) ≡ [(x + n) mod C | x <− c]
(133) P(e, t) ≡ R(F([⌊C·i/V⌋ | i <− N_V], e), ⌊(t mod E)/R⌋)
(134) G ≡ (P(η′_2, τ′), Φ(κ′))
We also define G*, which is equivalent to the value G as it would have been under the previous rotation:
(135)
let (e, k) = (η′_2, κ′) if ⌊(τ′ − R)/E⌋ = ⌊τ′/E⌋; (η′_3, λ′) otherwise
G* ≡ (P(e, τ′ − R), Φ(k))
11.4. Work Report Guarantees. We begin by defining the guarantees extrinsic, E_G, a series of guarantees, at most one for each core, each of which is a tuple of a work-report w, a credential a and its corresponding timeslot t. The core index of each guarantee must be unique and guarantees must be in ascending order of this. Formally:
(136) E_G ∈ ⟦(w ∈ W, t ∈ N_T, a ∈ ⟦(N_V, E)⟧_{2:3})⟧_{:C}
(137) E_G = [(g_w)_c | g ∈ E_G]
The credential is a sequence of two or three tuples of a unique validator index and a signature. Credentials must be ordered by their validator index:
(138) ∀g ∈ E_G : g_a = [v | (v, s) ∈ g_a]
The signature must be one whose public key is that of the validator identified in the credential, and whose message is the serialization of the hash of the work-report. The signing validators must be assigned to the core in question, either in this block's assignments G if the timeslot for the guarantee is in the same rotation as this block's timeslot, or in the most recent previous set of assignments, G*:
(139)
∀(w, t, a) ∈ E_G, ∀(v, s) ∈ a :
s ∈ E_{k[v]}⟨X_G ⌢ H(E(w))⟩ ∧ c[v] = w_c ∧ R·(⌊τ′/R⌋ − 1) ≤ t ≤ τ′
k ∈ R ⇔ ∃(w, t, a) ∈ E_G, ∃(v, s) ∈ a : k = k[v]
where (c, k) = G if ⌊τ′/R⌋ = ⌊t/R⌋; G* otherwise
(140) X_G ≡ $jam_guarantee
We note that the Ed25519 key of each validator whose signature is in a credential is placed in the reporters set R. This is utilized by the validator activity statistics bookkeeping system of section 13.
We denote w to be the set of work-reports in the present extrinsic:
(141) let w = {g_w | g ∈ E_G}
No report may be placed on a core with a report pending availability on it unless the latter has timed out. In the latter case, U = 5 slots must have elapsed after the report was made. A report is valid only if the authorizer hash is present in the authorizer pool of the core on which the work is reported. Formally:
(142) ∀w ∈ w : ρ‡[w_c] = ∅ ∨ H_t ≥ ρ‡[w_c]_t + U, w_a ∈ α[w_c]
We specify the maximum total accumulation gas requirement a work-report may imply as G_A, and we require the sum of all services' minimum gas requirements to be no greater than this:
(143) ∀w ∈ w : Σ_{r∈w_r} δ[r_s]_g ≤ G_A
11.4.1. Contextual Validity of Reports. For convenience, we define two equivalences x and p to be, respectively, the set of all contexts and work-package hashes within the extrinsic:
(144) let x = {w_x | w ∈ w}, p = {(w_s)_h | w ∈ w}
There must be no duplicate work-package hashes (i.e. no two work-reports of the same package). Therefore, we require the cardinality of p to be the length of the work-report sequence w:
(145) |p| = |w|
We require that the anchor block be within the last H blocks and that its details be correct by ensuring that it appears within our most recent blocks β:
(146) ∀x ∈ x : ∃y ∈ β : x_a = y_h ∧ x_s = y_s ∧ x_b = H_K(E_M(y_b))
We require that each lookup-anchor block be within the last L timeslots:
(147) ∀x ∈ x : x_t ≥ H_t − L
We also require that we have a record of it; this is one of the few conditions which cannot be checked purely with on-chain state and must be checked by virtue of retaining the series of the last L headers as the ancestor set A. Since it is determined through the header chain, it is still deterministic and calculable. Formally:
(148) ∀x ∈ x : ∃h ∈ A : h_t = x_t ∧ H(h) = x_l
We require that the work-package of the report not be the work-package of some other report made in the past. Since the work-package implies the anchor block, and the anchor block is limited to the most recent blocks, we need only ensure that the work-package not appear in our recent history:
(149) ∀p ∈ p, ∀x ∈ β : p ∉ x_p
We require that the prerequisite work-package, if present, be either in the extrinsic or in our recent history:
(150) ∀w ∈ w : (w_x)_p = ∅ ∨ (w_x)_p ∈ p ∪ {x | x ∈ b_p, b ∈ β}
We require that all work results within the extrinsic predicted the correct code hash for their corresponding service:
(151) ∀w ∈ w, ∀r ∈ w_r : r_c = δ[r_s]_c
11.5. Transitioning for Reports. We define ρ′ as being equivalent to ρ‡, except where the extrinsic replaced an entry. In the case an entry is replaced, the new value includes the present time τ′, allowing for the value to be replaced without respect to its availability once sufficient time has elapsed (see equation 142).
(152) ∀c ∈ N_C : ρ′[c] ≡ (w, τ′) if ∃(w, t, a) ∈ E_G : w_c = c; ρ‡[c] otherwise
This concludes the section on reporting and assurance. We now have a complete definition of ρ′ together with W, to be utilized in section 12, describing the portion of the state transition happening once a work-report is guaranteed and made available.
12. Accumulation
Accumulation may be defined as some function whose arguments are W and δ together with selected portions of (at times partially transitioned) state, and which yields the posterior service state δ′ together with additional state elements ι′, φ′ and χ′.
The proposition of accumulation is in fact quite sim-
ple: we merely wish to execute the Accumulate logic of
the service code of each of the services which has at least
one work output, passing to it the work outputs and use-
ful contextual information. However, there are three main
complications. Firstly, we must define the execution envi-
ronment of this logic and in particular the host functions
available to it. Secondly, we must define the amount of
gas to be allowed for each service’s execution. Finally, we
must determine the nature of transfers within Accumu-
late which, as we will see, leads to the need for a second
entry-point, on-transfer.
12.1. Preimage Integration. Prior to accumulation, we must first integrate all preimages provided in the lookup extrinsic. The lookup extrinsic is a sequence of pairs of service indices and data. These pairs must be ordered and without duplicates (equation 154 requires this). The data must have been solicited by a service but not yet provided. Formally:
(153) E_P ∈ ⟦(N_S, Y)⟧
(154) E_P = [i | i ∈ E_P]
(155) ∀(s, p) ∈ E_P : H(p) ∉ K(δ[s]_p), δ[s]_l[(H(p), |p|)] = []
We define δ† as the state after the integration of the preimages:
(156) δ† = δ ex. ∀(s, p) ∈ E_P : δ†[s]_p[H(p)] = p, δ†[s]_l[(H(p), |p|)] = [τ′]
12.2. Gas Accounting. We define S, the set of all services which will be accumulated in this block; this is all services which have at least one work output within W, together with the privileged services, χ_g. Formally:
(157) S ≡ {r_s | w ∈ W, r ∈ w_r} ∪ K(χ_g)
We calculate the gas attributable to each service as the sum of each of the service's work outputs' share of their report's elective accumulation gas together with the subtotal of minimum gas requirements:
(158)
G: N_S → N_G
G(s) ≡ Σ_{w∈W} Σ_{r∈w_r, r_s=s} ( δ†[s]_g + ⌊ r_g · (G_A − Σ_{r∈w_r} δ†[r_s]_g) / Σ_{r∈w_r} r_g ⌋ )
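Equation 158's split of the elective gas G_A can be illustrated as follows; the minimum-gas map standing in for δ†[s]_g and the G_A value are hypothetical, and integer division models the implied floor:

```python
G_A = 10_000_000  # hypothetical per-report accumulation gas limit

def gas_for_service(s, reports, min_gas):
    """Sum, over every result of service s, its minimum gas plus a
    pro-rata (by gas-prioritization ratio) share of the report's
    leftover gas. `reports` is a list of result lists; a result is a
    (service, ratio) pair and `min_gas` maps service -> minimum gas."""
    total = 0
    for results in reports:
        floor = sum(min_gas[svc] for (svc, _g) in results)
        weight = sum(g for (_svc, g) in results)
        for svc, g in results:
            if svc == s:
                total += min_gas[s] + g * (G_A - floor) // weight
    return total
```

Note that the per-report shares plus minimums sum back to G_A (up to flooring), so no report can attribute more accumulation gas than its limit.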
12.3. Wrangling. We finally define the results which will be given as an operand into the accumulate function for each service in S. This is a sequence of operand tuples O, one sequence for each service in S. Each sequence contains one element per work output (or error) to be accumulated for that service, together with said work output's payload hash, package hash and authorization output. The tuples are sequenced in the same order as they appear in W. Formally:
(159) O ≡ (o ∈ Y ∪ J, l ∈ H, k ∈ H, a ∈ Y)
(160)
M: N_S → ⟦O⟧
M(s) ≡ [(o: r_o, l: r_l, a: w_o, k: (w_s)_h) | w ∈ W, r ∈ w_r, r_s = s]
12.4. Invocation. Within this section, we define A, the function which conducts the accumulation of a single service. Formally speaking, A assumes omnipresence of the timeslot H_t and some prior state components δ†, ν and W; it takes as specific arguments the service index s ∈ S (from which it may derive the wrangled results M(s) and gas limit G(s)) and yields values for δ′[s] and staging assignments into φ and ι, together with a series of lookup solicitations/forgets, a series of deferred transfers and C, a mapping from service index to Beefy commitment hashes.
We first denote the set of deferred transfers as T, noting that a transfer includes a memo component m of M = 128 octets, together with the service index of the sender s, the service index of the receiver d, the amount of tokens to be transferred a and the gas limit g for the transfer. Formally:
(161) T ≡ (s ∈ N_S, d ∈ N_S, a ∈ N_B, m ∈ Y_M, g ∈ N_G)
We may then define A, the mapping from the index of accumulated services to the various components in terms of which we will be imminently defining our posterior state:
(162)
A: N_S → (s ∈ A?, v ∈ ⟦K⟧_V, t ∈ ⟦T⟧, r ∈ H?, c ∈ ⟦⟦H⟧_Q⟧_C, n ∈ D⟨N_S → A⟩, p ∈ (m ∈ N_S, a ∈ N_S, v ∈ N_S, g ∈ D⟨N_S → N_G⟩))
A(s) ≡ Ψ_A(δ†, s, G(s) + U((χ_g)_s, 0), M(s))
As can be seen plainly, our accumulation mapping A combines portions of the prior state into arguments for a virtual-machine invocation. Specifically, the service accounts δ†, together with the index of the service in question s, its gas limit and its wrangled refine-results M(s), are arranged to create the arguments for Ψ_A, itself using a virtual-machine invocation as defined in appendix B.4. Note that the gas limit is the sum of the regular gas G(s) together with any privileged gas it receives, (χ_g)_s.
The Beefy commitment map is a function mapping all accumulated services to their accumulation result (the r component of the result of A). This is utilized in determining the accumulation-result tree root for the present block, useful for the Beefy protocol:
(163) C ≡ {(s, A(s)_r) | s ∈ S, A(s)_r ≠ ∅}
Given our mapping A, which may be calculated exhaustively from the VM invocations of each accumulated service in S, we may define the posterior state δ′, χ′, φ′ and ι′ as the result of integrating A into our state.
12.4.1. Privileged Transitions. The staging core assignments, validator keys and privileged service set are each altered based on the effects of the accumulation of each of the three privileged services:
(164) χ′ ≡ A(χ_m)_p, φ′ ≡ A(χ_a)_c, ι′ ≡ A(χ_v)_v
12.4.2. Service Account Transitions. Finally, we integrate all changes to the service accounts into state.
We note that all newly added service indices, defined as K(A(s)_n) for any accumulated service s, must not conflict with the indices of existing services or other newly added services. This should never happen, since new indices are explicitly selected to avoid such conflicts, but in the unlikely event it happens, the block would be invalid. Formally:
(165) ∀s ∈ S : K(A(s)_n) ∩ K(δ†) = ∅, ∀t ∈ S ∖ {s} : K(A(s)_n) ∩ K(A(t)_n) = ∅
We first define δ‡, an intermediate state after main accumulation but before the transfers have been credited and handled:
(166)
K(δ‡) ≡ (K(δ†) ∪ ⋃_{s∈S} K(A(s)_n)) ∖ {s | s ∈ S, A(s)_s = ∅}
δ‡[s] ≡ A(s)_s if s ∈ S
  A(t)_n[s] if ∃!t : t ∈ S ∧ s ∈ K(A(t)_n)
  δ†[s] otherwise
We denote by R(s) the sequence of transfers received by a given service of index s, in order of them being sent from services of ascending index. (If some service s received no transfers, or simply does not exist, then R(s) is validly defined as the empty sequence.) Formally:
(167)
R: N_S → ⟦T⟧
R(d) ≡ [t | s <− S, t <− A(s)_t, t_d = d]
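A sketch of R(d), assuming each accumulated service's emitted transfers are kept in a map from service index to transfer list, with a transfer modelled as a (sender, destination, amount, memo, gas) tuple:

```python
def received(d, accumulated):
    """R(d): the transfers destined for service d, in ascending order
    of the sending service's index (preserving each sender's own
    emission order within that)."""
    out = []
    for s in sorted(accumulated):
        out += [t for t in accumulated[s] if t[1] == d]
    return out
```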
The posterior state δ′ may then be defined as the intermediate state with all the deferred effects of the transfers applied:
(168) δ′ = {s ↦ Ψ_T(δ‡, s, R(s)) | s ∈ K(δ‡)}
Note that Ψ_T is defined in appendix B.5 such that it results in δ‡[d], i.e. no difference to the account's intermediate state, if R(d) = [], i.e. said account received no transfers.
13. Validator Activity Statistics
The Jam chain does not explicitly issue rewards—we leave this as a job to be done by the staking subsystem (in Polkadot's case envisioned as a system parachain—hosted without fees—in the current imagining of a public Jam network). However, much as with validator punishment information, it is important for the Jam chain to facilitate the arrival of information on validator activity into the staking subsystem so that it may be acted upon.
Such performance information cannot directly cover all aspects of validator activity; whereas block production, guarantor reports and availability assurance can easily be tracked on-chain, Grandpa, Beefy and auditing activity cannot. In the latter case, this is instead tracked with validator voting activity: validators vote on their impression of each other's efforts and a median may be accepted as the truth for any given validator. With an assumption of 50% honest validators, this gives an adequate means of oraclizing this information.
The validator statistics are made on a per-epoch basis
and we retain one record of completed statistics together
with one record which serves as an accumulator for the
present epoch. Both are tracked in π, which is thus a
sequence of two elements, with the first being the accu-
mulator and the second the previous epoch’s statistics.
For each epoch we track a performance record for each
validator:
(169) π ∈ ⟦⟦(b ∈ N, t ∈ N, p ∈ N, d ∈ N, g ∈ N, a ∈ N)⟧_V⟧_2
The six statistics we track are:
b: The number of blocks produced by the validator.
t: The number of tickets introduced by the valida-
tor.
p: The number of preimages introduced by the val-
idator.
d: The total number of octets across all preimages
introduced by the validator.
g: The number of reports guaranteed by the valida-
tor.
a: The number of availability assurances made by
the validator.
The objective statistics are updated in line with their description. Formally:
(170) let e = ⌊τ/E⌋, e′ = ⌊τ′/E⌋
(171) (a, π′_1) ≡ (π_0, π_1) if e′ = e; ([(0, ..., 0), ...], π_0) otherwise
(172) ∀v ∈ N_V:
π′_0[v]_b ≡ a[v]_b + (v = H_i)
π′_0[v]_t ≡ a[v]_t + {|E_T| if v = H_i; 0 otherwise}
π′_0[v]_p ≡ a[v]_p + {|E_P| if v = H_i; 0 otherwise}
π′_0[v]_d ≡ a[v]_d + {Σ_{(s,p)∈E_P} |p| if v = H_i; 0 otherwise}
π′_0[v]_g ≡ a[v]_g + (κ′[v]_e ∈ R)
π′_0[v]_a ≡ a[v]_a + (∃a ∈ E_A : a_v = v)
Note that R is the Reporters set, as defined in equation
139.
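The counter updates of equation 172 can be sketched as follows for the block-derived statistics; records are plain dicts and all inputs (ticket count, preimage pairs, the reporters set, assuring validator indices, validator keys) are stand-ins for the corresponding extrinsic and state data:

```python
def update_stats(acc, author, n_tickets, preimages, reporters, assurers, keys):
    """Mutate `acc`, a list of one counter-dict per validator: credit
    the block author (index H_i) with blocks, tickets and preimage
    octets, and every validator with guarantees and assurances."""
    acc[author]["b"] += 1
    acc[author]["t"] += n_tickets
    acc[author]["p"] += len(preimages)
    acc[author]["d"] += sum(len(p) for (_s, p) in preimages)
    for v, key in enumerate(keys):
        acc[v]["g"] += key in reporters   # bool counts as 0 or 1
        acc[v]["a"] += v in assurers
    return acc
```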
14. Work Packages and Work Reports
14.1. Honest Behavior. We have so far specified how to recognize blocks for a correctly transitioning Jam blockchain. Through defining the state transition function and a state Merklization function, we have also defined how to recognize a valid header. While it is not especially difficult to understand how a new block may be authored by any node which controls a key allowing the creation of the two signatures in the header, nor indeed how to fill in the other header fields, readers will note that the contents of the extrinsic remain unclear.
We define not only correct behavior through the creation of correct blocks but also honest behavior, which involves the node taking part in several off-chain activities. This does have analogous aspects within YP Ethereum, though it is not mentioned so explicitly in said document: the creation of blocks along with the gossiping and inclusion of transactions within those blocks would all count as off-chain activities for which honest behavior is helpful. In Jam's case, honest behavior is well-defined and expected of at least 2/3 of validators.
Beyond the production of blocks, incentivized honest
behavior includes:
the guaranteeing and reporting of work-packages,
along with chunking and distribution of both the
chunks and the work-package itself, discussed in
section 15;
assuring the availability of work-packages after
being in receipt of their data;
determining which work-reports to audit, fetching
and auditing them, and creating and distributing
judgements appropriately based on the outcome
of the audit;
submitting the correct amount of auditing work
seen being done by other validators, discussed in
section 13.
14.2. Segments and the Manifest. Our basic erasure-coding segment size is W_C = 684 octets. This is derived from our wish to be able to reconstruct data even should almost two-thirds of our 1023 participants be malicious or incapacitated, from the 16-bit Galois field on which the erasure-code is based, and from the desire to efficiently support encoding data of close to, but no less than, 4 kB.
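The parameter arithmetic can be checked directly, under the assumption that reconstruction requires strictly more than one-third of the 1023 per-validator chunks and that each point of the GF(2^16) code is two octets:

```python
# Hypothetical derivation of the segment-size constants; the recovery
# threshold and octets-per-point are assumptions stated above, not
# quoted protocol definitions.
VALIDATORS = 1023
RECOVERY_PIECES = VALIDATORS // 3 + 1   # 342 chunks suffice to rebuild
W_C = 2 * RECOVERY_PIECES               # one 2-octet point per chunk
W_S = 6                                 # basic pieces per segment

assert W_C == 684
assert W_S * W_C == 4104                # just over 4 kB, as desired
```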
Work-packages are generally small to ensure guaran-
tors need not invest a lot of bandwidth in order to discover
whether they can get paid for their evaluation into a work-
report. Rather than having much data inline, they instead
reference data through commitments. The simplest com-
mitments are extrinsic data.
Extrinsic data are blobs which are being introduced
into the system alongside the work-package itself gener-
ally by the work-package builder. They are exposed to the
Refine logic as an argument. We commit to them through
including each of their hashes in the work-package.
Work-packages have two other types of external data
associated with them: A cryptographic commitment to
each imported segment and finally the number of segments
which are exported.
14.2.1. Segments, Imports and Exports. The ability to communicate large amounts of data from one work-package to some subsequent work-package is a key feature of the Jam availability system. An export segment, defined as the set G, is an octet sequence of fixed length W_S·W_C = 4104. It is the smallest datum which may individually be imported from, or exported to, the long-term Imports DA during the Refine function of a work-package. Being an exact multiple of the erasure-coding piece size ensures that the data segments of a work-package can be efficiently placed in the DA system.

(173)  G ≡ Y_{W_S·W_C}
Exported segments are data which are generated
through the execution of the Refine logic and thus are a
side effect of transforming the work-package into a work-
report. Since their data is deterministic based on the exe-
cution of the Refine logic, we do not require any particular
commitment to them in the work-package beyond know-
ing how many are associated with each Refine invocation
in order that we can supply an exact index.
On the other hand, imported segments are segments
which were exported by previous work-packages. In order
for them to be easily fetched and verified, they are referenced not by hash but rather by the root of a Merkle tree
which includes any other segments introduced at the time,
together with an index into this sequence. This allows for
justifications of correctness to be generated, stored, in-
cluded alongside the fetched data and verified. This is
described in depth in the next section.
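Concretely, such a justification is a Merkle co-path from the segment's hash up to the segments-root. The sketch below is an illustrative plain binary Merkle check in Python; the hash function and node-pairing rule are simplifying assumptions, not the paper's constant-depth tree of equation 299:

```python
import hashlib

def H(b: bytes) -> bytes:
    # Stand-in 32-octet hash; the paper's tree (equation 299) fixes its own
    # hash and a constant depth, both simplified away here.
    return hashlib.blake2b(b, digest_size=32).digest()

def verify(root: bytes, index: int, leaf: bytes, path: list[bytes]) -> bool:
    """Check that `leaf` sits at `index` under `root`, given the sibling
    hashes on the co-path from leaf to root."""
    node = H(leaf)
    for sibling in path:
        # At even positions we are the left child, at odd the right.
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == root
```

A fetched segment can thus be checked against the (root, index) pair in the manifest with only log-many sibling hashes.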
14.2.2. Data Collection and Justification. It is the task of
a guarantor to reconstitute all imported segments through
fetching said segments’ erasure-coded chunks from enough
unique validators. Reconstitution alone is not enough
since corruption of the data would occur if one or more
validators provided an incorrect chunk. For this reason
we ensure that the import segment specification (a Merkle
root and an index into the tree) be a kind of cryptographic
commitment capable of having a justification applied to
demonstrate that any particular segment is indeed correct.
Justification data must be available to any node over
the course of its segment’s potential requirement. At
around 350 bytes to justify a single segment, justification
data is too voluminous to have all validators store all data.
We therefore use the same overall availability framework
for hosting justification metadata as the data itself.
The guarantor is able to use this proof to justify to
themselves that they are not wasting their time on incor-
rect behavior. We do not force auditors to go through
the same process. Instead, guarantors build an Auditable
Work Package, and place this in the Audit DA system.
This is the original work-package, its extrinsic data, its
imported data and a concise proof of correctness of that
imported data. This tactic routinely duplicates data between the Imports DA and the Audits DA; however, this is acceptable in order to reduce the bandwidth cost for auditors, who must justify correctness as cheaply as possible since auditing happens on average 30 times for each work-package whereas guaranteeing happens only twice or thrice.
14.3. Packages and Items. We begin by defining a work-package, of set P, and its constituent work-items, of set I. A work-package includes a simple blob acting as an authorization token j, the index of the service which hosts the authorization code h, an authorization code hash u, a parameterization blob p, a context x and a sequence of work-items w:

(174)  P ≡ (j ∈ Y, h ∈ N_S, u ∈ H, p ∈ Y, x ∈ X, w ∈ ⟦I⟧_{1..I})
A work-item includes: s, the identifier of the service to which it relates; c, the code hash of the service at the time of reporting (whose preimage must be available from the perspective of the lookup-anchor block); y, a payload blob; g, a gas limit; and the three elements of its manifest: i, a sequence of imported data segments, each identified by the root of a segments tree and an index into it; x, a sequence of blob hashes and lengths to be introduced in this block (and which we assume the validator knows); and e, the number of data segments exported by this work-item:

(175)  I ≡ (s ∈ N_S, c ∈ H, y ∈ Y, g ∈ N_G, i ∈ ⟦(H, N)⟧, x ∈ ⟦(H, N)⟧, e ∈ N)
We limit the total number of exported items to W_M = 2^11. We also place the same limit on the total number of imported items:

(176)  ∀p ∈ P : Σ_{w ∈ p_w} w_e ≤ W_M  ∧  Σ_{w ∈ p_w} |w_i| ≤ W_M
We make an assumption that the preimage to each ex-
trinsic hash in each work-item is known by the guarantor.
In general this data will be passed to the guarantor along-
side the work-package.
We limit the encoded size of a work-package plus the total size of the implied import and extrinsic items to 12 MB in order to allow for around 2 MB/s/core data throughput:

(177)  ∀p ∈ P : |E(p)| + Σ_{w ∈ p_w} |w_i|·W_S·W_C + Σ_{w ∈ p_w} Σ_{(h,l) ∈ w_x} l ≤ W_P
(178)  W_P = 12·2^20
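A guarantor might sanity-check a package against these bounds before doing any heavier work. In the sketch below, the WorkItem and WorkPackage shapes are illustrative assumptions; only the bounds themselves come from equations 176 through 178:

```python
from dataclasses import dataclass, field

# Bounds from the text; the data shapes below are illustrative assumptions.
W_M = 2 ** 11            # max imported / exported segments per package
W_C = 684                # erasure-coding piece size (octets)
W_S = 6                  # pieces per segment
W_P = 12 * 2 ** 20       # max package-plus-data size (octets)

@dataclass
class WorkItem:
    imports: list        # (segments-root, index) pairs
    extrinsics: list     # (hash, length) pairs
    exported: int        # number of exported segments

@dataclass
class WorkPackage:
    encoded_size: int    # |E(p)|: size of the encoded package itself
    items: list = field(default_factory=list)

def within_limits(p: WorkPackage) -> bool:
    exports = sum(w.exported for w in p.items)
    imports = sum(len(w.imports) for w in p.items)
    total = (p.encoded_size
             + imports * W_S * W_C
             + sum(l for w in p.items for (_h, l) in w.extrinsics))
    return exports <= W_M and imports <= W_M and total <= W_P
```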
Given the result o of some work-item, we define the item-to-result function C as:

(179)  C ∈ (I, Y ∪ J) → L
       C((s, c, y, g, …), o) ≡ (s, c, H(y), g, o)
We define the work-package's implied authorizer as p_a, the hash of the concatenation of the authorization code and the parameterization. We define the authorization code as p_c and require that it be available at the time of the lookup-anchor block from the historical lookup of service p_h. Formally:

(180)  ∀p ∈ P : p_a = H(p_c ⌢ p_p) ,  p_c = Λ(δ[p_h], (p_x)_t, p_u) ,  p_c ∈ Y

(The historical lookup function, Λ, is defined in equation 93.)
14.3.1. Exporting. Any of a work-package’s work-items
may export segments and a segments-root is placed in the
work-report committing to these, ordered according to the
work-item which is exporting. It is formed as the root of a
constant-depth binary Merkle tree as defined in equation
299.
Guarantors are required to erasure-code and distribute two data sets: first, a single blob, the auditable work-package, containing the encoded work-package, extrinsic data and self-justifying imported segments, which is placed in the short-term Audit DA store; and second, the set of exported-segments data together with the Paged-Proofs metadata.
Items in the first store are short-lived; assurers are ex-
pected to keep them only until finality of the block which
included the work-result. Items in the second, meanwhile,
are long-lived and expected to be kept for a minimum of
28 days (672 complete epochs) following the reporting of
the work-report.
We define the paged-proofs function P, which accepts a series of exported segments s and defines a series of additional segments placed into the Imports DA system via erasure-coding and distribution. The function evaluates to pages of hashes, together with subtree proofs, such that justifications of correctness based on a segments-root may be made from it:

(181)  P ∈ ⟦G⟧ → ⟦G⟧
       s ↦ [ P_l(E(J_6(s, i), s_{i···+64})) ∣ i ∈ 64·N_{⌈|s|/64⌉} ]
       where l = W_S·W_C

Note: in the case that |s| is not a multiple of 64, then the term s_{i···+64} will correctly refer to fewer than 64 elements if it is the final page.
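The paging itself can be sketched as follows. This groups segment hashes into pages of 64 with a possibly short final page; the accompanying subtree proofs and the padding of each page to a full segment are omitted, and the stand-in hash is an assumption:

```python
import hashlib

def H(b: bytes) -> bytes:
    # Stand-in 32-octet hash (the paper's choice of hash is not assumed here).
    return hashlib.blake2b(b, digest_size=32).digest()

def hash_pages(segments: list[bytes]) -> list[list[bytes]]:
    """Group segment hashes into pages of 64, the final page possibly short.

    A sketch of the paging in equation 181; the real paged-proof segment
    also carries a subtree proof and is padded to a full segment, both
    omitted here."""
    hashes = [H(s) for s in segments]
    return [hashes[i:i + 64] for i in range(0, len(hashes), 64)]
```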
14.4. Computation of Work Results. We now come
to the work result computation function Ξ. This forms
the basis for all utilization of cores on Jam. It accepts
some work-package p for some nominated core c and re-
sults in either an error or the work result and series
of exported segments. This function is deterministic and
requires only that it be evaluated within eight epochs of
a recently finalized block thanks to the historical lookup
functionality. It can thus comfortably be evaluated by
any node within the auditing period, even allowing for
practicalities of imperfect synchronization.
Formally:

(182)  Ξ ∈ (P, N_C) → W
       Ξ(p, c) ≡ { o                            if o ∉ Y
                 { (a: p_a, o, x: p_x, s, r)    otherwise

where:
       o = Ψ_I(p, c)
       (r, e) = ᵀ[ (C(p_w[j], r), e) where (r, e) = I(p, j) ∣ j < |p_w| ]
       I(p, j) ≡ R(p, p_w[j], Σ_{k<j} p_w[k]_e)
       R(p, w, ℓ) ≡ Ψ_R(w_c, w_g, w_s, H(p), w_y, p_x, p_a, M(w), X(w), ℓ)
The definition here is staged over several functions for ease of reading. The first term to be introduced, o, is the authorization output, the result of the Is-Authorized function. The second term, (r, e), is the sequence of results for each of the work-items in the work-package together with all segments exported by each work-item.

The third and fourth definitions are helper terms: the third, I, performs an ordered accumulation (i.e. a counter) in order to ensure that the Refine function has access to the total number of exports made from the work-package up to the current work-item. The fourth term, R, is essentially just a call to the Refine function, marshalling the relevant data from the work-package and work-item.
The above relies on two functions, M and X, which respectively define the import-segment data and the extrinsic data for some work-item argument w:

(183)  M(w ∈ I) ≡ [ s[n] ∣ (M(s), n) ∈ w_i ]
       X(w ∈ I) ≡ [ d ∣ (H(d), |d|) ∈ w_x ]
We may then define s as the data-availability specification of the package using these two functions together with the yet-to-be-defined availability-specifier function A (see section 14.4.1):

(184)  s = A(H(p), E(p, x, i, j), |e|)
       where x = [ X(w) ∣ w ∈ p_w ]
       and   i = [ M(w) ∣ w ∈ p_w ]
       and   j = [ J(s, n) ∣ (M(s), n) ∈ w_i, w ∈ p_w ]
Note that while M and j are both formulated using
the term s (all segments exported by all work-packages
exporting a segment to be imported) such a vast amount
of data is not generally needed as the justification can be
derived through a single paged-proof. This reduces the
worst case data fetching for a guarantor to two segments
for every one to be imported. In the case that contiguously
exported segments are imported (which we might assume
is a fairly common situation), then a single proof-page
should be sufficient to justify many imported segments.
Also of note is the lack of length prefixes: only the
Merkle paths for the justifications (i.e. the elements of j)
have a length prefix. All other sequence lengths are deter-
minable through the work package itself.
The Is-Authorized logic it references is first executed to
ensure that the work-package warrants the needed core-
time. Next, the guarantor should ensure that all segment-
tree roots which form imported segment commitments
are known and have not expired. Finally, the guaran-
tor should ensure that they can fetch all preimage data
referenced as the commitments of extrinsic segments.
Once done, then imported segments must be recon-
structed. This process may in fact be lazy as the Refine
function makes no usage of the data until the import host-
call is made. Fetching generally implies that, for each im-
ported segment, erasure-coded chunks are retrieved from
enough unique validators (342, including the guarantor)
and is described in more depth in appendix H. (Since
we specify systematic erasure-coding, its reconstruction
is trivial in the case that the correct 342 validators are re-
sponsive.) Chunks must be fetched for both the data itself
and for justification metadata which allows us to ensure
that the data is correct.
Validators, in their role as availability assurers, should
index such chunks according to the index of the segments-
tree whose reconstruction they facilitate. Since the data
for segment chunks is so small at 12 bytes, fixed com-
munications costs should be kept to a bare minimum. A
good network protocol (out of scope at present) will al-
low guarantors to specify only the segments-tree root and
index together with a Boolean to indicate whether the
proof chunk need be supplied. Since we assume at least
341 other validators are online and benevolent, we can as-
sume that the guarantor can compute i and j above with
confidence, based on the general availability of data committed to with s, which is specified below.
14.4.1. Availability Specifier. We define the availability-specifier function A, which creates an availability specifier from the package hash, an octet sequence of the audit-friendly work-package bundle (comprising the work-package itself, the extrinsic data and the concatenated import segments along with their proofs of correctness), and the sequence of exported segments:

(185)  A ∈ (H, Y, ⟦G⟧) → S
       A(h, b, s) ≡ (h, l: |b|, u, e: M(s))
       where u = M_B([ x ⌢ y ∣ (x, y) ∈ ᵀ[b★, s★] ])
       and   b★ = H#(C_{⌈|b|/W_C⌉}(P_{W_C}(b)))
       and   s★ = M_B#(ᵀC#_6(s ⌢ P(s)))
The paged-proofs function P, defined earlier in equation 181, accepts a sequence of segments and returns a sequence of paged-proofs sufficient to justify the correctness of every segment. There are exactly 1/64 as many paged-proof segments as yielded segments, each composed of a page of 64 hashes of segments, together with a Merkle proof from the root to the subtree-root which includes those 64 segments.
The functions M and M_B are the fixed-depth and simple binary Merkle root functions, defined in equations 299 and 298. The function C is the erasure-coding function, defined in appendix H.
And P is the zero-padding function taking an octet array to some multiple of n in length:

(186)  P_{n ∈ N_{1...}} ∈ Y → Y_{kn}
       P_n(x) ≡ x ⌢ [0, 0, …]_{((|x|+n−1) mod n)+1 ... n}
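This zero-padding function appends the fewest zero octets taking the length to a multiple of n; a direct transliteration as a sketch:

```python
def zero_pad(x: bytes, n: int) -> bytes:
    """Append the fewest zero octets taking len(x) to a multiple of n;
    a transliteration of the zero-padding function of equation 186."""
    assert n >= 1
    return x + bytes((-len(x)) % n)
```

For instance, padding a 4096-octet blob with n = W_C = 684 yields 4104 octets, one full segment.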
Validators are incentivized to distribute each newly erasure-coded data chunk to the relevant validator, since they are not paid for guaranteeing unless a work-report is considered available by a super-majority of validators. Given our work-package p, we should therefore send to each validator its corresponding work-package bundle chunk, together with the similarly corresponding chunks of the imported, extrinsic and exported segments data, such that each validator can justify completeness according to the work-report's erasure-root. In the case of a coming epoch change, guarantors may also maximize expected reward by distributing to the new validator set.
We will see this function utilized in the next sections,
for guaranteeing, auditing and judging.
15. Guaranteeing
Guaranteeing work-packages involves the creation and
distribution of a corresponding work-report which requires
certain conditions to be met. Along with the report, a sig-
nature demonstrating the validator’s commitment to its
correctness is needed. With two guarantor signatures, the
work-report may be distributed to the forthcoming Jam chain block author in order to be used in the E_G extrinsic, which leads to a reward for the guarantors.
We presume that in a public system, validators will be
punished severely if they malfunction and commit to a
report which does not faithfully represent the result of Ξ
applied on a work-package. Overall, the process is:
(1) Evaluation of the work-package's authorization, and cross-referencing against the authorization pool in the most recent Jam chain state.
(2) Creation and publication of a work-package re-
port.
(3) Chunking of the work-package and each of its ex-
trinsic and exported data, according to the era-
sure codec.
(4) Distributing the aforementioned chunks across
the validator set.
(5) Providing the work-package, extrinsic and ex-
ported data to other validators on request is also
helpful for optimal network performance.
For any work-package p we are in receipt of, we may
determine the work-report, if any, it corresponds to for
the core c that we are assigned to. When Jam chain state is needed, we always utilize the chain state of the most recent block.
For any guarantor of index v assigned to core c and a
work-package p, we define the work-report r simply as:
(187) r = Ξ(p, c)
Such guarantors may safely create and distribute the
payload (s, v). The component s may be created accord-
ing to equation 139; specifically it is a signature using the
validator’s registered Ed25519 key on a payload l:
(188) l = H(c, r)
To maximize profit, the guarantor should require the
work result meets all expectations which are in place dur-
ing the guarantee extrinsic described in section 11.4. This
includes contextual validity, inclusion of the authorization
in the authorization pool, and ensuring total gas is at most G_A. Not doing so does not result in punishment, but will prevent the block author from including the package and so reduces rewards.
Advanced nodes may maximize the likelihood that their
reports will be includable on-chain by attempting to pre-
dict the state of the chain at the time that the report will
get to the block author. Naive nodes may simply use the
current chain head when verifying the work-report. To
minimize work done, nodes should make all such evaluations prior to evaluating the Ψ_R function to calculate the report's work results.
Once evaluated as a reasonable work-package to guar-
antee, guarantors should maximize the chance that their
work is not wasted by attempting to form consensus over
the core. To achieve this they should send the work-
package to any other guarantors on the same core which
they do not believe already know of it.
In order to minimize the work for block authors and
thus maximize expected profits, guarantors should at-
tempt to construct their core’s next guarantee extrinsic
from the work-report, core index and set of attestations
including their own and as many others as possible.
In order to minimize the chance of any block authors
disregarding the guarantor for anti-spam measures, guar-
antors should sign an average of no more than two work-
reports per timeslot.
16. Availability Assurance
Validators should issue a signed statement, called an
assurance, when they are in possession of all of their cor-
responding erasure-coded chunks for a given work-report
which is currently pending availability. For any work-
report to gain an assurance, there are two classes of data
a validator must have:
Firstly, their erasure-coded chunk for this report’s bun-
dle. The validity of this chunk can be trivially proven
through the work-report’s work-package erasure-root and
a Merkle proof of inclusion in the correct location. The proof should be provided by the guarantor. This chunk
is needed to verify the work-report’s validity and com-
pleteness and need not be retained after the work-report
is considered audited. Until then, it should be provided
on request to validators.
Secondly, the validator should have in hand the cor-
responding erasure-coded chunk for each of the exported
segments referenced by the segments root. These should
be retained for 28 days and provided to any validator on
request.
17. Auditing and Judging
The auditing and judging system is theoretically equiv-
alent to that in Elves, introduced by Jeff Burdges, Ceval-
los, et al. 2024. For a full security analysis of the mecha-
nism, see this work. There is a difference in terminology,
where the terms backing, approval and inclusion there re-
fer to our guaranteeing, auditing and accumulation, re-
spectively.
17.1. Overview. The auditing process involves each node requiring itself to fetch, evaluate and issue judgement on a random but deterministic set of work-reports from each Jam chain block in which the work-report becomes available (i.e. from W). Prior to any evaluation, a node declares and proves its requirement. At
specific common junctures in time thereafter, the set of
work-reports which a node requires itself to evaluate from
each block’s W may be enlarged if any declared intentions
are not matched by a positive judgement in a reasonable
time or in the event of a negative judgement being seen.
These enlargement events are called tranches.
If all declared intentions for a work-report are matched
by a positive judgement at any given juncture, then the
work-report is considered audited. Once all of any given
block’s newly available work-reports are audited, then we
consider the block to be audited. One prerequisite of a
node finalizing a block is for it to view the block as au-
dited. Note that while there will be eventual consensus on
whether a block is audited, there may not be consensus
at the time that the block gets finalized. This does not
affect the crypto-economic guarantees of this system.
In regular operation, no negative judgements will ultimately be found for a work-report, and there will be no direct consequences of the auditing stage. In the unlikely event that a negative judgement is found, then one of several things happens: if there are still more than 2/3·V positive judgements, then validators issuing negative judgements may receive a punishment for time-wasting. If there are greater than 1/3·V negative judgements, then the block which includes the work-report is ban-listed. It and all its descendants are disregarded and may not be built on. In all cases, once there are enough votes, a judgement extrinsic can be constructed by a block author and placed on-chain to denote the outcome. See section 10 for details on this.
All announcements and judgements are published to all validators along with metadata describing the signed material. On receipt of such data, validators are expected to update their perspective accordingly (later defined as J and A).
17.2. Data Fetching. For each work-report to be au-
dited, we use its erasure-root to request erasure-coded
chunks from enough assurers. From each assurer we fetch
three items (which with a good network protocol should be
done under a single request) corresponding to the work-
package super-chunks, the self-justifying imports super-
chunks and the extrinsic segments super-chunks.
We may validate the work-package reconstruction by ensuring its hash is equivalent to the hash included as part of the work-package specification in the work-report.
We may validate the extrinsic segments through ensur-
ing their hashes are each equivalent to those found in the
relevant work-item.
Finally, we may validate each imported segment as a
justification must follow the concatenated segments which
allows verification that each segment’s hash is included in
the referencing Merkle root and index of the correspond-
ing work-item.
Exported segments need not be reconstructed in the
same way, but rather should be determined in the same
manner as with guaranteeing, i.e. through the execution
of the Refine logic.
All items in the work-package specification field of the work-report should be recalculated from this now known-good data and verified, essentially retracing the guarantors' steps and ensuring correctness.
17.3. Selection of Reports. Each validator shall per-
form auditing duties on each valid block received. Since we
are entering off-chain logic, and we cannot assume consen-
sus, we henceforth consider ourselves a specific validator
of index v and assume ourselves focused on some block B with other terms corresponding, so σ′ is said block's posterior state, H is its header, &c. Practically, all considerations must be replicated for all blocks, and multiple blocks' considerations may be underway simultaneously.
We define the sequence of work-reports which we may be required to audit as Q, a sequence of length equal to the number of cores, which functions as a mapping from core index to a pending work-report which has just become available, or ∅ if no report became available on the core. Formally:

(189)  Q ∈ ⟦W?⟧_C
(190)  Q ≡ [ ρ[c]_w if ρ[c]_w ∈ W, ∅ otherwise ∣ c ∈ N_C ]
We define our initial audit tranche in terms of a verifiably random quantity s_0 created specifically for it:

(191)  s_0 ∈ F^[]_{κ[v]_b}⟨X_U ⌢ Y(H_v)⟩
(192)  X_U = $jam_audit
We may then define a_0 as the non-empty items to audit through a verifiably random selection of ten cores:

(193)  a_0 = { (c, w) ∣ (c, w) ∈ p_{···+10}, w ≠ ∅ }
(194)  where p = F([ (c, Q_c) ∣ c ∈ N_C ], r)
(195)  and r = Y(s_0)
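The selection amounts to a seeded shuffle of core/report pairs from which the first ten bearing a report are kept. In the sketch below, the hash-based random stream stands in for the paper's shuffle function F and the vrf output r; how entropy is drawn per step is an assumption here:

```python
import hashlib

def shuffle(seq: list, seed: bytes) -> list:
    """Deterministic Fisher-Yates shuffle; a stand-in for the paper's F."""
    out = list(seq)
    for i in range(len(out) - 1, 0, -1):
        # Derive an index in 0..i from the seed and position (an assumption;
        # the paper derives its entropy stream differently).
        digest = hashlib.blake2b(seed + i.to_bytes(4, "little"),
                                 digest_size=8).digest()
        j = int.from_bytes(digest, "little") % (i + 1)
        out[i], out[j] = out[j], out[i]
    return out

def tranche_zero(Q: list, seed: bytes, count: int = 10) -> list:
    """First `count` shuffled cores bearing a newly available report."""
    pairs = shuffle(list(enumerate(Q)), seed)
    return [(c, w) for (c, w) in pairs if w is not None][:count]
```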
Every A = 8 seconds following a new time slot, a new tranche begins, and we may determine that additional cores warrant an audit from us. Such items are defined as a_n where n is the current tranche. Formally:

(196)  let n = ⌊(T − P·τ) / A⌋
New tranches may contain items from Q stemming
from one of two reasons: either a negative judgement has
been received; or the number of judgements from the pre-
vious tranche is less than the number of announcements
from said tranche. In the first case, the validator is al-
ways required to issue a judgement on the work-report.
In the second case, a new special-purpose vrf must be
constructed to determine if an audit and judgement is
warranted from us.
In all cases, we publish a signed statement of which
of the cores we believe we are required to audit (an an-
nouncement) together with evidence of the vrf signature
to select them and the other validators’ announcements
from the previous tranche unmatched with a judgement in
order that all other validators are capable of verifying the
announcement. Publication of an announcement should be
taken as a contract to complete the audit regardless of any
future information.
Formally, for each tranche n we ensure the announcement statement is published and distributed to all other validators along with our validator index v, evidence s_n and all signed data. Validators' announcement statements must be in the set:

(197)  E_{κ[v]_e}⟨X_I ⌢ n ⌢ E([ E_2(c) ⌢ H(w) ∣ (c, w) ∈ a_n ])⟩
(198)  X_I = $jam_announce
We define A_n as our perception of which validators are required to audit each of the work-reports (identified by their associated core) at tranche n. This comes from each other validator's announcements (defined above). It cannot be correctly evaluated until n is current. We have absolute knowledge about our own audit requirements.

(199)  A_n ∈ W → ℘(N_V)
(200)  (c, w) ∈ a_0 ⟹ v ∈ A_0(w)
We further define J⊕ and J⊖ to be the validator indices who we know to have made, respectively, positive and negative judgements, mapped from each work-report's core. We don't care from which tranche a judgement is made.

(201)  J⊕, J⊖ ∈ W → ℘(N_V)
We are able to define a_n for tranches beyond the first on the basis of the number of validators who we know are required to conduct an audit yet from whom we have not yet seen a judgement. It is possible that the late arrival of information alters a_n, and nodes should reevaluate and act accordingly should this happen.

We can thus define a_n beyond the initial tranche through a new vrf which acts upon the set of no-show validators.
∀n > 0:

(202)  s_n(w) ∈ F^[]_{κ[v]_b}⟨X_U ⌢ Y(H_v) ⌢ H(w) ⌢ n⟩
(203)  a_n ≡ { w ∈ Q ∣ (V/256F)·Y(s_n(w))_0 < |A_{n−1}(w) ∖ J⊕(w)| }
We define our bias factor F = 2, which is the expected
number of validators which will be required to issue a
judgement for a work-report given a single no-show in the
tranche before. Modeling by Jeff Burdges, Cevallos, et al.
2024 shows that this is optimal.
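In effect, each validator privately flips a coin per report, weighted so that around F extra auditors appear per no-show; a sketch, with the vrf output modelled as a uniform value in [0, 1):

```python
V = 1023   # validator count
F = 2      # bias factor: expected extra auditors per no-show

def must_audit(vrf_out: float, no_shows: int) -> bool:
    """Sketch of the tranche-n self-selection test. `vrf_out` stands in for
    the validator's vrf output mapped uniformly onto [0, 1); each validator
    self-selects with probability F * no_shows / V, so across V validators
    about F audits are expected per no-show."""
    assert 0.0 <= vrf_out < 1.0
    return vrf_out < F * no_shows / V
```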
Later audits must be announced in a similar fashion
to the first. If audit requirements lessen on the receipt
of new information (i.e. a positive judgement being re-
turned for a previous no-show), then any audits already
announced are completed and judgements published. If
audit requirements rise on the receipt of new information (i.e. an additional announcement being found without an accompanying judgement), then we announce the
additional audit(s) we will undertake.
As n increases with the passage of time, a_n becomes known and defines our auditing responsibilities. We must attempt to reconstruct all work-packages and their requisite data corresponding to each work-report we must audit. This may be done through requesting erasure-coded chunks from one-third of the validators. It may also be short-cut by asking a cooperative third party (e.g. an original guarantor) for the preimages.
Thus, for any such work-report w we are assured we will be able to fetch some candidate work-package encoding F(w), which comes either from reconstructing erasure-coded chunks verified through the erasure coding's Merkle root, or alternatively from the preimage of the work-package hash. We decode this candidate blob into a work-package.
In addition to the work-package, we also assume we are
able to fetch all manifest data associated with it through
requesting and reconstructing erasure-coded chunks from
one-third of validators in the same way as above.
We then attempt to reproduce the report on the core to give e_n, a mapping from cores to evaluations:

(204)  ∀(c, w) ∈ a_n : e_n(w) ≡ { w = Ξ(p, c)   if ∃p ∈ P : E(p) = F(w)
                                { ⊥             otherwise

Note that a failure to decode implies an invalid work-report.
From this mapping the validator issues a set of judgements j_n:

(205)  j_n = { S_{κ[v]_e}(X_{e_n(w)} ⌢ H(w)) ∣ (c, w) ∈ a_n }
All judgements j_n should be published to other validators in order that they build their view of J and, in the case of a negative judgement arising, can form an extrinsic for E_D.
We consider a work-report as audited under two circumstances: either when it has no negative judgements and there exists some tranche in which we see a positive judgement from all validators who we believe are required to audit it; or when we see positive judgements for it from greater than two-thirds of the validator set.

(206)  U(w) ⇔ (J⊖(w) = ∅ ∧ ∃n : A_n(w) ⊆ J⊕(w)) ∨ |J⊕(w)| > (2/3)·V
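The audited predicate can be sketched directly over sets of judging validator indices; the parameter shapes here are illustrative:

```python
V = 1023  # validator count

def audited(pos: set, neg: set, required_by_tranche: list) -> bool:
    """Sketch of equation 206. `pos`/`neg` are the validator indices seen to
    judge the report positively/negatively (J+ and J-); required_by_tranche[n]
    is our view A_n of who must audit the report at tranche n."""
    if len(pos) > 2 * V / 3:          # super-majority path
        return True
    # Otherwise: no negatives, and some tranche at which every validator we
    # believe is required to audit has judged positively.
    return not neg and any(req <= pos for req in required_by_tranche)
```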
Our block B may be considered audited, a condition denoted U, when all the work-reports which were made available are considered audited. Formally:

(207)  U ⇔ ∀w ∈ W : U(w)

For any block we must judge it to be audited (i.e. that U holds) before we vote for the block to be finalized in Grandpa. See section 19 for more information here.
Furthermore, we pointedly disregard chains which include the accumulation of a report which we know at least 1/3 of validators judge as being invalid. Any chains including such a block are not eligible for authoring on. The best block, i.e. that on which we build new blocks, is defined as the chain with the most regular Safrole blocks which does not contain any such disregarded block. Implementation-wise, this may require reversion to an earlier head or alternative fork.
As a block author, we include a judgement extrinsic
which collects judgement signatures together and reports
them on-chain. In the case of a non-valid judgement (i.e.
one which is not two-thirds-plus-one of judgements con-
firming validity) then this extrinsic will be introduced in a
block in which accumulation of the non-valid work-report
is about to take place. The non-valid judgement extrin-
sic removes it from the pending work-reports, ρ. Refer to
section 10 for more details on this.
18. Beefy Distribution
For each finalized block B which a validator imports,
said validator shall make a bls signature on the bls12-381
curve, as defined by Hopwood et al. 2020, affirming the
Keccak hash of the block’s most recent Beefy mmr. This
should be published and distributed freely, along with the
signed material. These signatures may be aggregated in
order to provide concise proofs of finality to third-party
systems. The signing and aggregation mechanism is de-
fined fully by Jeff Burdges, Ciobotaru, et al. 2022.
Formally, let F_v be the signed commitment of validator index v which will be published:

(208)  F_v ∈ S_{κ[v]}⟨X_B ⌢ H_K(E_M(last(β)_b))⟩
(209)  X_B = $jam_beefy
19. Grandpa and the Best Chain
Nodes take part in the Grandpa protocol as defined by Stewart and Kokoris-Kogias 2020.

We define the latest finalized block as B♮. All associated terms concerning block and state are similarly superscripted. We consider the best block, B♭, to be that which is drawn from the set of acceptable blocks satisfying the following criteria:
• Has the finalized block as an ancestor.
Contains no unfinalized blocks where we see an
equivocation (two valid blocks at the same times-
lot).
Is considered audited.
Formally:
$$H^\flat \in \mathcal{A}(H^\sharp) \tag{210}$$
$$H^\sharp \in \mathbb{U} \tag{211}$$
$$\nexists\, H_A, H_B :\ H_A \neq H_B \,\wedge\, H_{A_T} = H_{B_T} \,\wedge\, H_A \in \mathcal{A}(H^\sharp) \,\wedge\, H_B \in \mathcal{A}(H^\sharp) \tag{212}$$
Of these acceptable blocks, that which contains the most ancestor blocks whose author used a seal-key ticket, rather than a fallback key, should be selected as the best head, and thus the chain on which the participant should make Grandpa votes.
Formally, we aim to select $B^\sharp$ to maximize the value $m$ where:
$$m = \left|\,\{\, H_A \in \mathcal{A}(H^\sharp) : H_A \text{ was sealed with a ticketed key} \,\}\,\right| \tag{213}$$
20. Discussion
20.1. Technical Characteristics. In total, with our stated target of 1,023 validators and three validators per core, along with requiring a mean of ten audits per validator per timeslot, and thus 30 audits per work-report, Jam is capable of trustlessly processing and integrating 341 work-packages per timeslot.
We assume node hardware is a modern 16 core cpu
with 64gb ram, 1tb secondary storage and 0.5gbe net-
working.
Our performance models assume a rough split of cpu time as follows:

                      Proportion
Audits                     10/16
Merklization                1/16
Block execution             2/16
Grandpa and Beefy           1/16
Erasure coding              1/16
Networking & misc           1/16
Estimates for network bandwidth requirements are as follows:

                      Upload    Download
                      (mb/s)      (mb/s)
Guaranteeing              30          40
Assuring                  60          56
Auditing                 200         200
Block publication         42          42
Grandpa and Beefy          4           4
Total                    336         342
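As a sanity check, the totals above are simply the row sums; a minimal sketch using the table's own figures:

```python
# Estimated validator bandwidth (mb/s), taken from the table above;
# the totals are the sums of the per-activity rows.
rows = {
    "guaranteeing":      (30, 40),
    "assuring":          (60, 56),
    "auditing":          (200, 200),
    "block publication": (42, 42),
    "grandpa and beefy": (4, 4),
}
upload = sum(u for u, _ in rows.values())
download = sum(d for _, d in rows.values())
print(upload, download)  # 336 342
```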
Thus, a connection able to sustain 500mb/s should leave a sufficient margin of error and headroom to serve other validators as well as some public connections, though the burstiness of block publication implies validators
Under these conditions, we would expect an overall
network-provided data availability capacity of 2pb, with
each node dedicating at most 6tb to availability storage.
Estimates for memory usage are as follows:

                      gb
Auditing              20    2 × 10 pvm instances
Block execution        2    1 pvm instance
State cache           40
Misc                   2
Total                 64
As a rough guide, each parachain has an average foot-
print of around 2mb in the Polkadot Relay chain; a 40gb
state would allow 20,000 parachains’ information to be
retained in state.
What might be called the “virtual hardware” of a Jam core is essentially a regular cpu core executing at somewhere between 25% and 50% of regular speed for the whole six-second portion, which may draw and provide 2.5mb/s average in general-purpose i/o and utilize up to 2gb in ram. The i/o includes any trustless reads from the Jam chain state, albeit in the recent past. This virtual hardware also provides unlimited reads from a semi-static preimage-lookup database.
Each work-package may occupy this hardware and execute arbitrary code on it in six-second segments to create some result of at most 90kb. This work result is then entitled to 10ms on the same machine, this time with no “external” i/o beyond said result, but instead with full and immediate access to the Jam chain state, and it may alter the service(s) to which the results belong.
20.2. Illustrating Performance. In terms of pure processing power, the Jam machine architecture can deliver extremely high levels of homogeneous trustless computation. However, the core model of Jam is a classic parallelized compute architecture, and for solutions to be able to utilize the architecture well they must be designed with it in mind to some extent. Accordingly, until such use-cases appear on Jam with similar semantics to existing ones, it is very difficult to make direct comparisons to existing systems. That said, if we indulge ourselves with some assumptions then we can make some crude comparisons.
20.2.1. Comparison to Polkadot. Pre-asynchronous back-
ing, Polkadot validates around 50 parachains, each one
utilizing approximately 250ms of native computation (i.e.
half a second of Wasm execution time at around a 50%
overhead) and 5mb of i/o for every twelve seconds of
real time which passes. This corresponds to an aggregate
compute performance of around parity with a native cpu
core and a total 24-hour distributed availability of around
20mb/s. Accumulation is beyond Polkadot’s capabilities
and so not comparable.
Post asynchronous-backing and estimating that Polka-
dot is at present capable of validating at most 80
parachains each doing one second of native computation
in every six, then the aggregate performance is increased
to around 13x native cpu and the distributed availability
increased to around 67mb/s.
For comparison, in our basic models, Jam should be capable of attaining around 85x the computation load of a single native cpu core and a distributed availability of 852mb/s.
20.2.2. Simple Transfers. We might also attempt to model a simple transactions-per-second amount, with each transaction requiring a signature verification and the modification of two account balances. Once again, until there are clear designs for precisely how this would work we must make some assumptions. Our most naive model would be to use the Jam cores (i.e. refinement) simply for transaction verification and account lookups. The Jam chain would then hold and alter the balances in its state. This is unlikely to give great performance since almost all the needed i/o would be synchronous, but it can serve as a basis.
A 15mb work-package can hold around 125k transac-
tions at 128 bytes per transaction. However, a 90kb work-
result could only encode around 11k account updates when
each update is given as a pair of a 4 byte account index
and 4 byte balance, resulting in a limit of 5.5k transac-
tions per package, or 312k tps in total. It is possible that
the eight bytes could typically be compressed by a byte
or two, increasing maximum throughput a little. Our ex-
pectations are that state updates, with highly parallelized
Merklization, can be done at between 500k and 1 million
reads/write per second, implying around 250k-350k tps,
depending on which turns out to be the bottleneck.
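The arithmetic behind these limits can be sketched directly; all byte sizes below are the section's stated assumptions, and the result lands in the low-300k range quoted above:

```python
# Naive transfer-throughput model: cores (refinement) only verify
# transactions; the chain applies the balance updates.
PACKAGE_BYTES = 15 * 1024 * 1024    # work-package capacity
TX_BYTES      = 128                 # per-transaction encoding
RESULT_BYTES  = 90 * 1024           # work-result size limit
UPDATE_BYTES  = 8                   # 4-byte account index + 4-byte balance
CORES, SLOT_SECONDS = 341, 6

tx_capacity     = PACKAGE_BYTES // TX_BYTES         # ~123k tx fit per package
update_capacity = RESULT_BYTES // UPDATE_BYTES      # ~11.5k account updates
tx_limit = min(tx_capacity, update_capacity // 2)   # two updates per transfer
tps = tx_limit * CORES // SLOT_SECONDS              # ~327k with these roundings
```

The text's rounder figure of 5.5k transactions per package yields the quoted ~312k tps.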
A more sophisticated model would be to use the Jam cores for balance updates as well as transaction verification. We would have to assume that state and the transactions which operate on it can be partitioned between work-packages with some degree of efficiency, and that the 15mb of the work-package would be split between transaction data and state witness data. Our basic models predict that a 4bn 32-bit account system paginated into 2^10 accounts/page and 128 bytes per transaction could, assuming only around 1% of oraclized accounts were useful, average upwards of 1.7mtps depending on partitioning and usage characteristics. Partitioning could be done with a fixed fragmentation (essentially sharding state), a rotating partition pattern or a dynamic partitioning (which would require specialized sequencing).
Interestingly, we expect neither model to be bottle-
necked in computation, meaning that transactions could
be substantially more sophisticated, perhaps with more
flexible cryptography or smart contract functionality,
without a significant impact on performance.
20.2.3. Computation Throughput. The tps metric does
not lend itself well to measuring distributed systems’ com-
putational performance, so we now turn to another slightly
more compute-focussed benchmark: the evm. The basic
YP Ethereum network, now approaching a decade old, is
probably the best known example of general purpose de-
centralized computation and makes for a reasonable yard-
stick. It is able to sustain a computation and i/o rate of
1.25M gas/sec, with a peak throughput of twice that. The
evm gas metric was designed to be a time-proportional
metric for predicting and constraining program execution.
Attempting to determine a concrete comparison to pvm
throughput is non-trivial and necessarily opinionated ow-
ing to the disparity between the two platforms including
word size, endianness and stack/register architecture and
memory model. However, we will attempt to determine a
reasonable range of values.
Evm gas does not directly translate into native execution as it also combines state reads and writes as well as transaction input data, implying it is able to process some combination of up to 595 storage reads, 57 storage writes and 1.25M gas as well as 78kb input data in each second, trading one against the other.¹³ We cannot find any analysis of the typical breakdown between storage i/o and pure computation, so to make a very conservative estimate, we assume it does all four. In reality, we would expect it to be able to do on average 1/4 of each.
Our experiments¹⁴ show that on modern, high-end consumer hardware with a modern evm implementation, we can expect somewhere between 100 and 500 gas/µs in throughput on pure-compute workloads (we specifically utilized Odd-Product, Triangle-Number and several implementations of the Fibonacci calculation). To make a conservative comparison to pvm, we propose transcompilation of the evm code into pvm code and then re-execution of it under the Polkavm prototype.¹⁵
To help estimate a reasonable lower-bound of evm gas/µs, e.g. for workloads which are more memory- and i/o-intensive, we look toward real-world permissionless deployments of the evm and see that the Moonbeam network, after correcting for the slowdown of executing within the recompiled WebAssembly platform on the somewhat conservative Polkadot hardware platform, implies a throughput of around 100 gas/µs. We therefore assert that in terms of computation, 1µs of native execution approximates to around 100-500 evm gas on modern high-end consumer hardware.¹⁶
Benchmarking and regression tests show that the prototype pvm engine has a fixed preprocessing overhead of around 5ns/byte of program code and, for arithmetic-heavy tasks at least, a marginal factor of 1.6-2% compared to evm execution, implying an asymptotic speedup of around 50-60x. For machine code 1mb in size expected to take of the order of a second to compute, the compilation cost becomes only 0.5% of the overall time.¹⁷ For code not inherently suited to the 256-bit evm isa, we would expect substantially improved relative execution times on pvm, though more work must be done in order to gain confidence that these speed-ups are broadly applicable.
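The preprocessing claim above is simple arithmetic; a quick check with the stated figures:

```python
# Preprocessing-overhead check: at ~5 ns per byte of program code,
# compiling 1 mb of machine code costs 5 ms -- about 0.5% of a
# one-second execution, as claimed in the text.
NS_PER_BYTE = 5
code_bytes = 1_000_000
compile_seconds = code_bytes * NS_PER_BYTE / 1e9   # 0.005 s
overhead = compile_seconds / 1.0                   # relative to ~1 s of work
print(f"{overhead:.1%}")  # 0.5%
```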
If we allow for preprocessing to take up to the same
component within execution as the marginal cost (owing
13
The latest “proto-danksharding” changes allow it to accept 87.3kb/s in committed-to data though this is not directly available within
state, so we exclude it from this illustration, though including it with the input data would change the results little.
14
This is detailed at https://hackmd.io/@XXX9CM1uSSCWVNFRYaSB5g/HJarTUhJA and intended to be updated as we get more information.
15
It is conservative since we don’t take into account that the source code was originally compiled into evm code and thus the pvm
machine code will replicate architectural artifacts and thus is very likely to be pessimistic. As an example, all arithmetic operations in evm
are 256-bit and 32-bit native pvm is being forced to honor this even if the source code only actually required 32-bit values.
16
We speculate that the substantial range could possibly be caused in part by the major architectural differences between the evm isa
typical modern hardware.
17
As an example, our odd-product benchmark, a very much pure-compute arithmetic task, execution takes 58s on evm, and 1.04s within
our pvm prototype, including all preprocessing.
to, for example, an extremely large but short-running program) and for the pvm metering to imply a safety overhead of 2x to execution speeds, then we can expect a Jam core to be able to process the equivalent of around 1,500 evm gas/µs. Owing to the crudeness of our analysis we might reasonably predict it to be somewhere within a factor of three either way, i.e. 500-5,000 evm gas/µs.
Jam cores are each capable of 2.5mb/s bandwidth,
which must include any state i/o and data which must be
newly introduced (e.g. transactions). While writes come
at comparatively little cost to the core, only requiring
hashing to determine an eventual updated Merkle root,
reads must be witnessed, with each one costing around
640 bytes of witness conservatively assuming a one-million
entry binary Merkle trie. This would result in a maxi-
mum of a little under 4k reads/second/core, with the ex-
act amount dependent upon how much of the bandwidth
is used for newly introduced input data.
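The read bound follows directly from the stated figures; the 20-level, 32-byte-hash decomposition of the 640-byte witness is an assumption of this sketch:

```python
# Witnessed-read throughput: each in-core read carries ~640 bytes of
# Merkle proof (a one-million-entry binary trie is ~20 levels deep at
# 32 bytes per sibling hash), set against the core's 2.5 mb/s i/o budget.
IO_BYTES_PER_SECOND = 2_500_000
WITNESS_BYTES_PER_READ = 20 * 32           # 640, the text's conservative figure
max_reads_per_second = IO_BYTES_PER_SECOND // WITNESS_BYTES_PER_READ
print(max_reads_per_second)  # 3906, i.e. "a little under 4k"
```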
Aggregating everything across Jam, excepting accumulation which could add further throughput, numbers can be multiplied by 341 (with the caveat that each one’s computation cannot interfere with any of the others’ except through state oraclization and accumulation). Unlike for roll-up chain designs such as Polkadot and Ethereum, there is no need to have persistently fragmented state. Smart-contract state may be held in a coherent format on the Jam chain so long as any updates are made through
the 15kb/core/sec work results, which would need to con-
tain only the hashes of the altered contracts’ state roots.
Under our modelling assumptions, we can therefore summarize:

                        Eth. L1    Jam Core        Jam
Compute (evm gas/µs)      1.25†    500-5,000   0.15-1.5m
State writes (s⁻¹)          57†          n/a         n/a
State reads (s⁻¹)          595†           4k‡       1.4m
Input data (s⁻¹)          78kb†        2.5mb‡      852mb
What we can see is that Jam’s overall predicted performance profile could be many thousands of times that of the basic Ethereum L1 chain. The large factor here is essentially due to three things: spatial parallelism, as Jam can host several hundred cores under its security apparatus; temporal parallelism, as Jam targets continuous execution for its cores and pipelines much of the computation between blocks to ensure a constant, optimal workload; and platform optimization by using a vm and gas model which closely fits modern hardware architectures.
It must however be understood that this is a provisional and crude estimation only. It is included only for the purpose of expressing Jam’s performance in tangible terms and is not intended as a means of comparing to a “full-blown” Ethereum/L2-ecosystem combination. Specifically, it does not take into account:
that these numbers are based on real performance of Ethereum and performance modelling of Jam (though our models are based on real-world performance of the components);
any L2 scaling which may be possible with either Jam or Ethereum;
the state partitioning which uses of Jam would imply;
the as-yet unfixed gas model for the pvm;
that pvm/evm comparisons are necessarily imprecise;
(†) all figures for Ethereum L1 are drawn from the same resource: on average each figure will be only 1/4 of this maximum.
(‡) the state reads and input data figures for Jam are drawn from the same resource: on average each figure will be only 1/2 of this maximum.
We leave as further work an empirical analysis of performance, together with an analysis and comparison between Jam and the aggregate of a hypothetical Ethereum ecosystem which included some maximal amount of L2 deployments together with full Dank-sharding and any other additional consensus elements which they would require. This, however, is out of scope for the present work.
21. Conclusion
We have introduced a novel computation model which is able to make use of pre-existing crypto-economic mechanisms in order to deliver major improvements in scalability without causing persistent state-fragmentation and thus sacrificing overall cohesion. We call this overall pattern collect-refine-join-accumulate. Furthermore, we have formally defined the on-chain portion of this logic, essentially the join-accumulate portion. We call this protocol the Jam chain.
We argue that the model of Jam provides a novel “sweet spot”, allowing for massive amounts of computation to be done in secure, resilient consensus compared to fully-synchronous models, while still offering strict guarantees about both the timing and the integration of the computation into some singleton state machine, unlike persistently fragmented models.
21.1. Further Work. While we are able to estimate the-
oretical computation possible given some basic assump-
tions and even make broad comparisons to existing sys-
tems, practical numbers are invaluable. We believe the
model warrants further empirical research in order to bet-
ter understand how these theoretical limits translate into
real-world performance. We feel a proper cost analysis
and comparison to pre-existing protocols would also be an
excellent topic for further work.
We can be reasonably confident that the design of Jam allows it to host a service under which Polkadot parachains could be validated; however, further prototyping work is needed to understand the possible throughput which a pvm-powered metering system could support. We leave such a report as further work. Likewise, we have also intentionally omitted details of higher-level protocol elements including cryptocurrency, coretime sales, staking and regular smart-contract functionality.
A number of potential alterations to the protocol de-
scribed here are being considered in order to make prac-
tical utilization of the protocol easier. These include:
Synchronous calls between services in accumulate.
Restrictions on the transfer function in order to
allow for substantial parallelism over accumula-
tion.
The possibility of reserving substantial additional
computation capacity during accumulate under
certain conditions.
Introducing Merklization into the Work Package
format in order to obviate the need to have the
whole package downloaded in order to evaluate
its authorization.
The networking protocol is also left intentionally un-
defined at this stage and its description must be done in
a follow-up proposal.
Validator performance is not presently tracked on-chain. We do expect this to be tracked on-chain in the final revision of the Jam protocol, but its specific format is not yet certain and it is therefore omitted at present.
22. Acknowledgements
Much of this present work is based in large part on the work of others. The Web3 Foundation research team, and in particular Alistair Stewart and Jeff Burdges, are responsible for Elves, the security apparatus of Polkadot which enables the possibility of in-core computation for Jam. The same team is responsible for Sassafras, Grandpa and Beefy.
Safrole is a mild simplification of Sassafras and was
made under the careful review of Davide Galassi and Al-
istair Stewart.
The original CoreJam rfc was refined under the re-
view of Bastian Köcher and Robert Habermeier and most
of the key elements of that proposal have made their way
into the present work.
The pvm is a formalization of a partially simplified
PolkaVM software prototype, developed by Jan Bujak.
Cyrill Leutwiler contributed to the empirical analysis of
the pvm reported in the present work.
The PolkaJam team, and in particular Arkadiy Paronyan, Emeric Chevalier and Dave Emett, have been instrumental in the design of the lower-level aspects of the Jam protocol, especially concerning Merklization and i/o.
Numerous contributors to the repository since publica-
tion have helped correct errors. Thank you to all.
And, of course, thanks to the awesome Lemon Jelly,
a.k.a. Fred Deakin and Nick Franglen, for three of the
most beautiful albums ever produced, the cover art of the
first of which was inspiration for this paper’s background
art.
Appendix A. Polka Virtual Machine
A.1. Basic Definition. We declare the general pvm function Ψ. We assume a single-step invocation function Ψ₁ and define the full pvm recursively as a sequence of such mutations up until the single-step mutation results in a halting condition.
$$\Psi\colon (\mathbb{Y},\,\mathbb{N}_R,\,\mathbb{N}_G,\,[\mathbb{N}_R]_{13},\,\mathbb{M}) \to (\{\text{halt},\text{panic},\text{oog}\} \cup \{F\}\times\mathbb{N}_R \cup \{h\}\times\mathbb{N}_R,\ \mathbb{N}_R,\ \mathbb{Z}_G,\ [\mathbb{N}_R]_{13},\ \mathbb{M}) \tag{214}$$
$$(p, ı, ξ, ω, µ) \mapsto \begin{cases} \Psi(p, ı', ξ', ω', µ') & \text{if } ε = \text{continue} \\ (\text{oog}, ı', ξ', ω', µ') & \text{if } ξ' < 0 \\ (ε, ı', ξ', ω', µ') & \text{otherwise} \end{cases}$$
where $(ε, ı', ξ', ω', µ') = Ψ_1(c, k, j, ı, ξ, ω, µ)$
and $p = \mathcal{E}(|j|) \frown \mathcal{E}_1(z) \frown \mathcal{E}(|c|) \frown \mathcal{E}_z(j) \frown \mathcal{E}(c) \frown \mathcal{E}(k)$ with $|k| = |c|$.
If the latter condition cannot be satisfied, then $(\text{panic}, ı, ξ, ω, µ)$ is the result.
The pvm exit reason $ε \in \{\text{halt},\text{panic},\text{oog}\} \cup \{F, h\}\times\mathbb{N}_R$ may be one of regular halt, panic or out-of-gas (oog), or alternatively a host-call $h$, in which case the host-call identifier is associated, or a page-fault $F$, in which case the faulting ram address is associated.
A.2. Instructions, Opcodes and Skip-distance. The program blob p is split into a series of octets which make
up the instruction data c and the opcode bitmask k as well as the dynamic jump table, j. The former two imply an
instruction sequence, and by extension a basic-block sequence, itself a sequence of indices of the instructions which follow
a block-termination instruction.
The latter, dynamic jump table, is a sequence of indices into the instruction data blob and is indexed into when
dynamically-computed jumps are taken. It is encoded as a sequence of natural numbers (i.e. non-negative integers) each
encoded with the same length in octets. This length, term z above, is itself encoded prior.
The pvm counts instructions in octet terms (rather than in terms of instructions) and it is thus convenient to define
which octets represent the beginning of an instruction, i.e. the opcode octet, and which do not. This is the purpose of k,
the instruction-opcode bitmask. We assert that the length of the bitmask is equal to the length of the instruction blob.
We define the Skip function skip which provides the number of octets, minus one, to the next instruction’s opcode,
given the index of instruction’s opcode index into c (and by extension k):
(215) skip
N N
i min(24 , j N (k [1, 1, . . . ])
i+1+j
= 1)
The Skip function appends k with a sequence of set bits in order to ensure a well-defined result for the final instruction
skip(c 1).
Given some instruction-index i, its opcode is readily expressed as $c_i$ and the distance in octets to move forward to the next instruction is 1 + skip(i). However, each instruction’s “length” (defined as the number of contiguous octets starting with the opcode which are needed to fully define the instruction’s semantics) is left implicit, though limited to being at most 16.
We define ζ as being equivalent to the instructions c except with an indefinite sequence of zeroes suffixed to ensure that no out-of-bounds access is possible. This effectively defines any otherwise-undefined arguments to the final instruction and ensures that a trap will occur if the program counter passes beyond the program code. Formally:
$$ζ \equiv c \frown [0, 0, \dots] \tag{216}$$
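A direct transcription of the skip function, assuming the bitmask k is given as a list of bits; the example opcode positions are hypothetical:

```python
# Sketch of the skip function (eq. 215): number of octets, minus one, from
# instruction-opcode index i to the next opcode, capped at 24. Indices past
# the end of k behave as set bits (the appended [1, 1, ...] padding).
def skip(k, i):
    for j in range(24):
        if i + 1 + j >= len(k) or k[i + 1 + j] == 1:
            return j
    return 24

k = [1, 0, 0, 1, 0, 1]   # opcodes at octet indices 0, 3 and 5
assert skip(k, 0) == 2   # next opcode is 3 octets away
assert skip(k, 3) == 1
assert skip(k, 5) == 0   # final instruction: padding bits kick in
```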
A.3. Basic Blocks and Termination Instructions. Instructions of the following opcodes are considered basic-block termination instructions; other than trap & fallthrough, they correspond to instructions which may define the instruction-counter to be something other than its prior value plus the instruction’s skip amount:
Trap and fallthrough: trap, fallthrough
Jumps: jump, jump_ind
Load-and-Jumps: load_imm_jump, load_imm_jump_ind
Branches: branch_eq, branch_ne, branch_ge_u, branch_ge_s, branch_lt_u, branch_lt_s, branch_eq_imm, branch_ne_imm
Immediate branches: branch_lt_u_imm, branch_lt_s_imm, branch_le_u_imm, branch_le_s_imm, branch_ge_u_imm, branch_ge_s_imm, branch_gt_u_imm, branch_gt_s_imm
We denote this set, as opcode indices rather than names, as T. We define the instruction opcode indices denoting the beginning of basic-blocks as ϖ:
$$ϖ \equiv [0] \frown [\, n + 1 + \text{skip}(n) \mid n < |c|,\ k_n = 1 \wedge c_n \in T \,] \tag{217}$$
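The construction of ϖ can be sketched as follows; the opcode values in T here are stand-ins for this example, not the real terminator indices:

```python
# Sketch of eq. 217: code indices beginning a basic block, i.e. index 0 plus
# every index immediately after a block-termination instruction.
def skip(k, i):
    j = 0
    while j < 24 and i + 1 + j < len(k) and k[i + 1 + j] == 0:
        j += 1
    return j

def block_starts(c, k, T):
    starts = [0]
    for n in range(len(c)):
        if k[n] == 1 and c[n] in T:          # opcode octet of a terminator
            starts.append(n + 1 + skip(k, n))
    return starts

c = [5, 9, 9, 4, 0]   # terminator at 0 (2 arg octets), plain op, terminator
k = [1, 0, 0, 1, 1]
T = {5, 0}            # hypothetical terminator opcodes for this example
assert block_starts(c, k, T) == [0, 3, 5]
```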
A.4. Single-Step State Transition. We must now define the single-step pvm state-transition function Ψ₁:
$$Ψ_1\colon (\mathbb{Y},\,\mathbb{B},\,[\mathbb{N}_R],\,\mathbb{N}_R,\,\mathbb{N}_G,\,[\mathbb{N}_R]_{13},\,\mathbb{M}) \to (\{\text{halt},\text{panic},\text{oog}\} \cup \{F, h\}\times\mathbb{N}_R,\ \mathbb{N}_R,\ \mathbb{Z}_G,\ [\mathbb{N}_R]_{13},\ \mathbb{M}) \tag{218}$$
$$(c, k, j, ı, ξ, ω, µ) \mapsto (ε, ı', ξ', ω', µ')$$
We define ε together with the posterior values (denoted as prime) of each of the items of the machine state as being in accordance with the table below.
In general, when transitioning machine state for an instruction, a number of conditions hold true and instructions are defined essentially by their exceptions to these rules. Specifically, the machine does not halt, the instruction counter increments by one instruction, the gas remaining is reduced by the amount corresponding to the instruction type, and ram & registers are unchanged. Formally:
$$ε = \text{continue},\quad ı' = ı + 1 + \text{skip}(ı),\quad ξ' = ξ - ξ_Δ,\quad ω' = ω,\quad µ' = µ \qquad \text{except as indicated} \tag{219}$$
Where ram must be inspected and yet access is not possible, then machine state is unchanged, and the exit reason is a fault with the lowest address to be read which is inaccessible. More formally, let a be the set of indices into which µ must be subscripted in order to calculate the result of Ψ₁. If $a \not\subseteq V_µ$, then let $ε = F \times \min(a \setminus V_µ)$.
Similarly, where ram must be mutated and yet mutable access is not possible, then machine state is unchanged, and the exit reason is a fault with the lowest address to be written which is inaccessible. More formally, let a be the set of indices into which µ′ must be subscripted in order to calculate the result of Ψ₁. If $a \not\subseteq V^*_µ$, then let $ε = F \times \min(a \setminus V^*_µ)$.
We define signed/unsigned transitions for various octet widths:
$$Z_{n \in \mathbb{N}}\colon \mathbb{N}_{2^{8n}} \to \mathbb{Z}_{-2^{8n-1} \dots 2^{8n-1}},\quad a \mapsto \begin{cases} a & \text{if } a < 2^{8n-1} \\ a - 2^{8n} & \text{otherwise} \end{cases} \tag{220}$$
$$Z^{-1}_{n \in \mathbb{N}}\colon \mathbb{Z}_{-2^{8n-1} \dots 2^{8n-1}} \to \mathbb{N}_{2^{8n}},\quad a \mapsto (2^{8n} + a) \bmod 2^{8n} \tag{221}$$
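A minimal transcription of the two conversions, with n the octet width:

```python
# Two's-complement reinterpretation (eqs. 220-221): Z maps an n-octet
# unsigned value to its signed reading; Z_inv is the inverse mapping.
def Z(n, a):
    return a if a < 2 ** (8 * n - 1) else a - 2 ** (8 * n)

def Z_inv(n, a):
    return (2 ** (8 * n) + a) % 2 ** (8 * n)

assert Z(1, 0xFF) == -1
assert Z(4, 0x7FFFFFFF) == 2 ** 31 - 1
assert Z_inv(4, -1) == 0xFFFFFFFF
assert Z_inv(1, Z(1, 200)) == 200   # the two functions round-trip
```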
$$B_{n \in \mathbb{N}}\colon \mathbb{N}_{2^{8n}} \to \mathbb{B}_{8n},\quad x \mapsto y \ \text{where}\ \forall i \in \mathbb{N}_{8n} : y_i = \left\lfloor \frac{x}{2^i} \right\rfloor \bmod 2 \tag{222}$$
$$B^{-1}_{n \in \mathbb{N}}\colon \mathbb{B}_{8n} \to \mathbb{N}_{2^{8n}},\quad x \mapsto \sum_{i \in \mathbb{N}_{8n}} x_i \cdot 2^i \tag{223}$$
Immediate arguments are encoded in little-endian format with the most-significant bit being the sign bit. They may be compactly encoded by eliding more-significant octets. Elided octets are assumed to be zero if the msb of the value is zero, and 255 otherwise. This allows for compact representation of both positive and negative encoded values. We thus define the sign-extension function operating on an input of n octets as X_n:
$$X_{n \in \{0,1,2,3,4\}}\colon \mathbb{N}_{2^{8n}} \to \mathbb{N}_{2^{32}},\quad x \mapsto x + \left\lfloor \frac{x}{2^{8n-1}} \right\rfloor \cdot (2^{32} - 2^{8n}) \tag{224}$$
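The formula simply replicates the top bit of the n-octet value; a sketch:

```python
# Sign extension (eq. 224): widen an n-octet immediate to 32 bits by
# replicating its most-significant bit.
def X(n, x):
    if n == 0:
        return 0
    sign_bit = x // 2 ** (8 * n - 1)          # 0 or 1, given x < 2**(8n)
    return x + sign_bit * (2 ** 32 - 2 ** (8 * n))

assert X(1, 0x7F) == 0x7F                     # positive: unchanged
assert X(1, 0xFF) == 0xFFFFFFFF               # -1, sign-extended
assert X(2, 0x8000) == 0xFFFF8000
assert X(4, 0xDEADBEEF) == 0xDEADBEEF         # full width: identity
```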
Any alterations of the program counter stemming from a static jump, call or branch must be to the start of a basic block or else a panic occurs. Hypotheticals are not considered. Formally:
$$\text{branch}(b, C) \Rightarrow (ε, ı') = \begin{cases} (\text{continue}, ı) & \text{if } \neg C \\ (\text{panic}, ı) & \text{otherwise if } b \notin ϖ \\ (\text{continue}, b) & \text{otherwise} \end{cases} \tag{225}$$
Jumps whose next instruction is dynamically computed must use an address which may be indexed into the jump-table j. Through a quirk of tooling¹⁸, we define the dynamic address required by the instructions as the jump-table index incremented by one and then multiplied by our jump alignment factor $Z_A = 2$.
As with other irregular alterations to the program counter, the target code index must be the start of a basic block or else a panic occurs. Formally:
$$\text{djump}(a) \Rightarrow (ε, ı') = \begin{cases} (\text{halt}, ı) & \text{if } a = 2^{32} - 2^{16} \\ (\text{panic}, ı) & \text{otherwise if } a = 0 \vee a > |j| \cdot Z_A \vee a \bmod Z_A \neq 0 \vee j_{a/Z_A - 1} \notin ϖ \\ (\text{continue}, j_{a/Z_A - 1}) & \text{otherwise} \end{cases} \tag{226}$$
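The validity rules above can be sketched directly; the jump table and block set are hypothetical example data:

```python
# Dynamic-jump validity (eq. 226), with Z_A = 2: the address must be a
# one-based jump-table index scaled by Z_A, and the table entry must start
# a basic block; 2**32 - 2**16 is the distinguished halt address.
Z_A = 2
HALT_ADDR = 2 ** 32 - 2 ** 16

def djump(a, jump_table, block_starts):
    if a == HALT_ADDR:
        return "halt"
    if a == 0 or a > len(jump_table) * Z_A or a % Z_A != 0:
        return "panic"
    target = jump_table[a // Z_A - 1]
    return target if target in block_starts else "panic"

j = [0, 6, 14]                           # hypothetical jump table
blocks = {0, 6, 9}
assert djump(HALT_ADDR, j, blocks) == "halt"
assert djump(3, j, blocks) == "panic"    # not a multiple of Z_A
assert djump(4, j, blocks) == 6          # index 1 -> code index 6
assert djump(6, j, blocks) == "panic"    # 14 is not a block start
```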
A.5. Instruction Tables. Note that in the case that the opcode is not defined in the following tables then the instruction is considered invalid, and it results in a panic: ε = panic.
We assume the skip length, skip(ı), is well-defined (227).
A.5.1. Instructions without Arguments.

ζ_ı   Name          ξ_Δ   Mutations
0     trap          0     ε = panic
17    fallthrough   0
¹⁸ The popular code-generation backend llvm requires and assumes in its code generation that dynamically computed jump destinations always have a certain memory alignment. Since at present we depend on this for our tooling, we must acquiesce to its assumptions.
A.5.2. Instructions with Arguments of One Immediate.
$$\text{let } l_X = \min(4, \text{skip}(ı)),\quad ν_X \equiv X_{l_X}(\mathcal{E}^{-1}_{l_X}(ζ_{ı+1 \dots +l_X})) \tag{228}$$

ζ_ı   Name     ξ_Δ   Mutations
78    ecalli   0     ε = h × ν_X
A.5.3. Instructions with Arguments of Two Immediates.
$$\begin{aligned} \text{let } l_X &= \min(4, ζ_{ı+1} \bmod 8), & ν_X &\equiv X_{l_X}(\mathcal{E}^{-1}_{l_X}(ζ_{ı+2 \dots +l_X})) \\ \text{let } l_Y &= \min(4, \max(0, \text{skip}(ı) - l_X - 1)), & ν_Y &\equiv X_{l_Y}(\mathcal{E}^{-1}_{l_Y}(ζ_{ı+2+l_X \dots +l_Y})) \end{aligned} \tag{229}$$

ζ_ı   Name            ξ_Δ   Mutations
62    store_imm_u8    0     µ′_{ν_X} = ν_Y mod 2⁸
79    store_imm_u16   0     µ′_{ν_X⋯+2} = E₂(ν_Y mod 2¹⁶)
38    store_imm_u32   0     µ′_{ν_X⋯+4} = E₄(ν_Y)
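The two-immediate decoding can be sketched as follows; treating the elided term in l_Y's definition as the instruction's skip value is an assumption of this sketch, as the parallel definitions suggest:

```python
# Decoding the two immediates of a store_imm_* instruction (eq. 229): the
# octet after the opcode gives l_X (mod 8, capped at 4); l_Y takes the
# remaining argument octets. skip_len is the total count of argument octets.
def sext(n, x):
    # 32-bit sign extension of an n-octet little-endian value (eq. 224)
    if n == 0:
        return 0
    return x + (x // 2 ** (8 * n - 1)) * (2 ** 32 - 2 ** (8 * n))

def decode_two_imms(code, i, skip_len):
    l_x = min(4, code[i + 1] % 8)
    l_y = min(4, max(0, skip_len - l_x - 1))
    nu_x = sext(l_x, int.from_bytes(bytes(code[i + 2:i + 2 + l_x]), "little"))
    nu_y = sext(l_y, int.from_bytes(
        bytes(code[i + 2 + l_x:i + 2 + l_x + l_y]), "little"))
    return nu_x, nu_y

code = [38, 2, 0x00, 0x80, 0xFF]       # opcode, length octet, 2+1 imm octets
nu_x, nu_y = decode_two_imms(code, 0, 4)
assert nu_x == 0xFFFF8000              # 0x8000 sign-extended
assert nu_y == 0xFFFFFFFF              # 0xFF sign-extended
```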
A.5.4. Instructions with Arguments of One Offset.
$$\text{let } l_X = \min(4, \text{skip}(ı)),\quad ν_X \equiv ı + Z_{l_X}(\mathcal{E}^{-1}_{l_X}(ζ_{ı+1 \dots +l_X})) \tag{230}$$

ζ_ı   Name   ξ_Δ   Mutations
5     jump   0     branch(ν_X, ⊤)
A.5.5. Instructions with Arguments of One Register & One Immediate.
$$\begin{aligned} \text{let } r_A &= \min(12, ζ_{ı+1} \bmod 16), & ω_A &\equiv ω_{r_A},\ \ ω'_A \equiv ω'_{r_A} \\ \text{let } l_X &= \min(4, \max(0, \text{skip}(ı) - 1)), & ν_X &\equiv X_{l_X}(\mathcal{E}^{-1}_{l_X}(ζ_{ı+2 \dots +l_X})) \end{aligned} \tag{231}$$

ζ_ı   Name        ξ_Δ   Mutations
19    jump_ind    0     djump((ω_A + ν_X) mod 2³²)
4     load_imm    0     ω′_A = ν_X
60    load_u8     0     ω′_A = µ_{ν_X}
74    load_i8     0     ω′_A = Z⁻¹₄(Z₁(µ_{ν_X}))
76    load_u16    0     ω′_A = E⁻¹₂(µ_{ν_X⋯+2})
66    load_i16    0     ω′_A = Z⁻¹₄(Z₂(E⁻¹₂(µ_{ν_X⋯+2})))
10    load_u32    0     ω′_A = E⁻¹₄(µ_{ν_X⋯+4})
71    store_u8    0     µ′_{ν_X} = ω_A mod 2⁸
69    store_u16   0     µ′_{ν_X⋯+2} = E₂(ω_A mod 2¹⁶)
22    store_u32   0     µ′_{ν_X⋯+4} = E₄(ω_A)
A.5.6. Instructions with Arguments of One Register & Two Immediates.
$$\begin{aligned} \text{let } r_A &= \min(12, ζ_{ı+1} \bmod 16), & ω_A &\equiv ω_{r_A},\ \ ω'_A \equiv ω'_{r_A} \\ \text{let } l_X &= \min\!\big(4, \lfloor ζ_{ı+1}/16 \rfloor \bmod 8\big), & ν_X &= X_{l_X}(\mathcal{E}^{-1}_{l_X}(ζ_{ı+2 \dots +l_X})) \\ \text{let } l_Y &= \min(4, \max(0, \text{skip}(ı) - l_X - 1)), & ν_Y &= X_{l_Y}(\mathcal{E}^{-1}_{l_Y}(ζ_{ı+2+l_X \dots +l_Y})) \end{aligned} \tag{232}$$

ζ_ı   Name                ξ_Δ   Mutations
26    store_imm_ind_u8    0     µ′_{ω_A+ν_X} = ν_Y mod 2⁸
54    store_imm_ind_u16   0     µ′_{ω_A+ν_X⋯+2} = E₂(ν_Y mod 2¹⁶)
13    store_imm_ind_u32   0     µ′_{ω_A+ν_X⋯+4} = E₄(ν_Y)
A.5.7. Instructions with Arguments of One Register, One Immediate and One Offset.
$$\begin{aligned} \text{let } r_A &= \min(12, ζ_{ı+1} \bmod 16), & ω_A &\equiv ω_{r_A},\ \ ω'_A \equiv ω'_{r_A} \\ \text{let } l_X &= \min\!\big(4, \lfloor ζ_{ı+1}/16 \rfloor \bmod 8\big), & ν_X &= X_{l_X}(\mathcal{E}^{-1}_{l_X}(ζ_{ı+2 \dots +l_X})) \\ \text{let } l_Y &= \min(4, \max(0, \text{skip}(ı) - l_X - 1)), & ν_Y &= ı + Z_{l_Y}(\mathcal{E}^{-1}_{l_Y}(ζ_{ı+2+l_X \dots +l_Y})) \end{aligned} \tag{233}$$

ζ_ı   Name              ξ_Δ   Mutations
6     load_imm_jump     0     branch(ν_Y, ⊤), ω′_A = ν_X
7     branch_eq_imm     0     branch(ν_Y, ω_A = ν_X)
15    branch_ne_imm     0     branch(ν_Y, ω_A ≠ ν_X)
44    branch_lt_u_imm   0     branch(ν_Y, ω_A < ν_X)
59    branch_le_u_imm   0     branch(ν_Y, ω_A ≤ ν_X)
52    branch_ge_u_imm   0     branch(ν_Y, ω_A ≥ ν_X)
50    branch_gt_u_imm   0     branch(ν_Y, ω_A > ν_X)
32    branch_lt_s_imm   0     branch(ν_Y, Z₄(ω_A) < Z₄(ν_X))
46    branch_le_s_imm   0     branch(ν_Y, Z₄(ω_A) ≤ Z₄(ν_X))
45    branch_ge_s_imm   0     branch(ν_Y, Z₄(ω_A) ≥ Z₄(ν_X))
53    branch_gt_s_imm   0     branch(ν_Y, Z₄(ω_A) > Z₄(ν_X))
A.5.8. Instructions with Arguments of Two Registers.
$$\begin{aligned} \text{let } r_D &= \min(12, ζ_{ı+1} \bmod 16), & ω_D &\equiv ω_{r_D},\ \ ω'_D \equiv ω'_{r_D} \\ \text{let } r_A &= \min(12, \lfloor ζ_{ı+1}/16 \rfloor), & ω_A &\equiv ω_{r_A},\ \ ω'_A \equiv ω'_{r_A} \end{aligned} \tag{234}$$

ζ_ı   Name       ξ_Δ   Mutations
82    move_reg   0     ω′_D = ω_A
87    sbrk       0     ω′_D = min{ x ∈ N_R : x ≥ h ∧ N_{x⋯+ω_A} ∩ V_µ = ∅ }, N_{x⋯+ω_A} ⊆ V_{µ′}

Note, the term h above refers to the beginning of the heap, the second major segment of memory as defined in equation 245 as 2Z_Q + Q(o). If the sbrk instruction is invoked on a pvm instance which does not have such a memory layout, then h = 0.
A.5.9. Instructions with Arguments of Two Registers & One Immediate.
(235)
let r
A
= min(12, (ζ
ı+1
)mod 16), ω
A
ω
r
A
, ω
A
ω
r
A
let r
B
= min(12,
ζ
ı+1
16
), ω
B
ω
r
B
, ω
B
ω
r
B
let l
X
= min(4, max(0, 1)), ν
X
X
l
X
(E
1
l
X
(ζ
ı+2⋅⋅⋅+l
X
))
ζ
ı
Name ξ
Mutations
16 store_ind_u8 0 µ
ω
B
+ν
X
= ω
A
mo d 2
8
29 store_ind_u16 0 µ
ω
B
+ν
X
⋅⋅⋅+2
= E
2
(ω
A
mo d 2
16
)
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.3.8
September 23, 2024 37
ζ_ı  Name               ξ  Mutations
3    store_ind_u32      0  µ′_{ω_B+ν_X⋅⋅⋅+4} = E_4(ω_A)
11   load_ind_u8        0  ω′_A = µ_{ω_B+ν_X}
21   load_ind_i8        0  ω′_A = Z⁻¹_4(Z_1(µ_{ω_B+ν_X}))
37   load_ind_u16       0  ω′_A = E⁻¹_2(µ_{ω_B+ν_X⋅⋅⋅+2})
33   load_ind_i16       0  ω′_A = Z⁻¹_4(Z_2(E⁻¹_2(µ_{ω_B+ν_X⋅⋅⋅+2})))
1    load_ind_u32       0  ω′_A = E⁻¹_4(µ_{ω_B+ν_X⋅⋅⋅+4})
2    add_imm            0  ω′_A = (ω_B + ν_X) mod 2^32
18   and_imm            0  ∀i ∈ N_32 ∶ B_4(ω′_A)_i = B_4(ω_B)_i ∧ B_4(ν_X)_i
31   xor_imm            0  ∀i ∈ N_32 ∶ B_4(ω′_A)_i = B_4(ω_B)_i ⊕ B_4(ν_X)_i
49   or_imm             0  ∀i ∈ N_32 ∶ B_4(ω′_A)_i = B_4(ω_B)_i ∨ B_4(ν_X)_i
35   mul_imm            0  ω′_A = (ω_B ⋅ ν_X) mod 2^32
65   mul_upper_s_s_imm  0  ω′_A = Z⁻¹_4(⌊(Z_4(ω_B) ⋅ Z_4(ν_X)) ÷ 2^32⌋)
63   mul_upper_u_u_imm  0  ω′_A = ⌊(ω_B ⋅ ν_X) ÷ 2^32⌋
27   set_lt_u_imm       0  ω′_A = ω_B < ν_X
56   set_lt_s_imm       0  ω′_A = Z_4(ω_B) < Z_4(ν_X)
9    shlo_l_imm         0  ω′_A = (ω_B ⋅ 2^{ν_X mod 32}) mod 2^32
14   shlo_r_imm         0  ω′_A = ⌊ω_B ÷ 2^{ν_X mod 32}⌋
25   shar_r_imm         0  ω′_A = Z⁻¹_4(⌊Z_4(ω_B) ÷ 2^{ν_X mod 32}⌋)
40   neg_add_imm        0  ω′_A = (ν_X + 2^32 − ω_B) mod 2^32
39   set_gt_u_imm       0  ω′_A = ω_B > ν_X
61   set_gt_s_imm       0  ω′_A = Z_4(ω_B) > Z_4(ν_X)
75   shlo_l_imm_alt     0  ω′_A = (ν_X ⋅ 2^{ω_B mod 32}) mod 2^32
72   shlo_r_imm_alt     0  ω′_A = ⌊ν_X ÷ 2^{ω_B mod 32}⌋
80   shar_r_imm_alt     0  ω′_A = Z⁻¹_4(⌊Z_4(ν_X) ÷ 2^{ω_B mod 32}⌋)
85   cmov_iz_imm        0  ω′_A = ν_X if ω_B = 0; ω_A otherwise
86   cmov_nz_imm        0  ω′_A = ν_X if ω_B ≠ 0; ω_A otherwise
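A few of the immediate-argument mutations above can be sketched in Python. Here Z_4 and Z⁻¹_4 are the signed/unsigned 32-bit reinterpretation maps; the shar_r_imm line relies on Python's flooring behaviour of `>>` on negative integers to express ⌊Z_4(ω_B) ÷ 2^k⌋. This is an illustrative sketch, not normative code.

```python
MOD32 = 2**32

def z4(x):
    """Interpret a 32-bit unsigned value as signed (Z_4)."""
    return x - MOD32 if x >= 2**31 else x

def z4_inv(x):
    """Re-encode a signed value as 32-bit unsigned (Z_4^-1)."""
    return x % MOD32

def add_imm(w_b, v_x):
    return (w_b + v_x) % MOD32          # wrapping addition

def neg_add_imm(w_b, v_x):
    return (v_x + MOD32 - w_b) % MOD32  # wrapping reverse subtraction

def shar_r_imm(w_b, v_x):
    # arithmetic shift right: floor-divide the signed value by 2^(v_x mod 32)
    return z4_inv(z4(w_b) >> (v_x % 32))
```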
A.5.10. Instructions with Arguments of Two Registers & One Offset.
(236)
let r_A = min(12, (ζ_{ı+1}) mod 16), ω_A ≡ ω_{r_A}, ω′_A ≡ ω′_{r_A}
let r_B = min(12, ⌊ζ_{ı+1} ÷ 16⌋), ω_B ≡ ω_{r_B}, ω′_B ≡ ω′_{r_B}
let l_X = min(4, max(0, skip(ı) − 1)), ν_X = ı + Z_{l_X}(E⁻¹_{l_X}(ζ_{ı+2⋅⋅⋅+l_X}))

ζ_ı  Name         ξ  Mutations
24   branch_eq    0  branch(ν_X, ω_A = ω_B)
30   branch_ne    0  branch(ν_X, ω_A ≠ ω_B)
47   branch_lt_u  0  branch(ν_X, ω_A < ω_B)
48   branch_lt_s  0  branch(ν_X, Z_4(ω_A) < Z_4(ω_B))
41   branch_ge_u  0  branch(ν_X, ω_A ≥ ω_B)
43   branch_ge_s  0  branch(ν_X, Z_4(ω_A) ≥ Z_4(ω_B))
A.5.11. Instruction with Arguments of Two Registers and Two Immediates.
(237)
let r_A = min(12, (ζ_{ı+1}) mod 16), ω_A ≡ ω_{r_A}, ω′_A ≡ ω′_{r_A}
let r_B = min(12, ⌊ζ_{ı+1} ÷ 16⌋), ω_B ≡ ω_{r_B}, ω′_B ≡ ω′_{r_B}
let l_X = min(4, ζ_{ı+2} mod 8), ν_X = X_{l_X}(E⁻¹_{l_X}(ζ_{ı+3⋅⋅⋅+l_X}))
let l_Y = min(4, max(0, skip(ı) − l_X − 2)), ν_Y = X_{l_Y}(E⁻¹_{l_Y}(ζ_{ı+3+l_X⋅⋅⋅+l_Y}))

ζ_ı  Name               ξ  Mutations
42   load_imm_jump_ind  0  djump((ω_B + ν_Y) mod 2^32), ω′_A = ν_X
A.5.12. Instructions with Arguments of Three Registers.
(238)
let r_A = min(12, (ζ_{ı+1}) mod 16), ω_A ≡ ω_{r_A}, ω′_A ≡ ω′_{r_A}
let r_B = min(12, ⌊ζ_{ı+1} ÷ 16⌋), ω_B ≡ ω_{r_B}, ω′_B ≡ ω′_{r_B}
let r_D = min(12, ζ_{ı+2}), ω_D ≡ ω_{r_D}, ω′_D ≡ ω′_{r_D}

ζ_ı  Name           ξ  Mutations
8    add            0  ω′_D = (ω_A + ω_B) mod 2^32
20   sub            0  ω′_D = (ω_A + 2^32 − ω_B) mod 2^32
23   and            0  ∀i ∈ N_32 ∶ B_4(ω′_D)_i = B_4(ω_A)_i ∧ B_4(ω_B)_i
28   xor            0  ∀i ∈ N_32 ∶ B_4(ω′_D)_i = B_4(ω_A)_i ⊕ B_4(ω_B)_i
12   or             0  ∀i ∈ N_32 ∶ B_4(ω′_D)_i = B_4(ω_A)_i ∨ B_4(ω_B)_i
34   mul            0  ω′_D = (ω_A ⋅ ω_B) mod 2^32
67   mul_upper_s_s  0  ω′_D = Z⁻¹_4(⌊(Z_4(ω_A) ⋅ Z_4(ω_B)) ÷ 2^32⌋)
57   mul_upper_u_u  0  ω′_D = ⌊(ω_A ⋅ ω_B) ÷ 2^32⌋
81   mul_upper_s_u  0  ω′_D = Z⁻¹_4(⌊(Z_4(ω_A) ⋅ ω_B) ÷ 2^32⌋)
68   div_u          0  ω′_D = 2^32 − 1 if ω_B = 0; ⌊ω_A ÷ ω_B⌋ otherwise
64   div_s          0  ω′_D = 2^32 − 1 if ω_B = 0; ω_A if Z_4(ω_A) = −2^31 ∧ Z_4(ω_B) = −1; Z⁻¹_4(Z_4(ω_A) ÷ Z_4(ω_B)) otherwise
73   rem_u          0  ω′_D = ω_A if ω_B = 0; ω_A mod ω_B otherwise
70   rem_s          0  ω′_D = ω_A if ω_B = 0; 0 if Z_4(ω_A) = −2^31 ∧ Z_4(ω_B) = −1; Z⁻¹_4(Z_4(ω_A) mod Z_4(ω_B)) otherwise
36   set_lt_u       0  ω′_D = ω_A < ω_B
58   set_lt_s       0  ω′_D = Z_4(ω_A) < Z_4(ω_B)
55   shlo_l         0  ω′_D = (ω_A ⋅ 2^{ω_B mod 32}) mod 2^32
51   shlo_r         0  ω′_D = ⌊ω_A ÷ 2^{ω_B mod 32}⌋
77   shar_r         0  ω′_D = Z⁻¹_4(⌊Z_4(ω_A) ÷ 2^{ω_B mod 32}⌋)
83   cmov_iz        0  ω′_D = ω_A if ω_B = 0; ω_D otherwise
84   cmov_nz        0  ω′_D = ω_A if ω_B ≠ 0; ω_D otherwise
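The signed-division special cases above (division by zero and the −2^31 ÷ −1 overflow) can be sketched in Python, assuming the ÷ of the div_s row truncates toward zero as is conventional for such machine semantics. Illustrative only, not normative.

```python
MOD32 = 2**32

def z4(x): return x - MOD32 if x >= 2**31 else x  # unsigned -> signed
def z4_inv(x): return x % MOD32                   # signed -> unsigned

def div_s(a, b):
    if b == 0:
        return MOD32 - 1                      # division by zero
    if z4(a) == -2**31 and z4(b) == -1:
        return a                              # signed overflow: result is omega_A
    q = abs(z4(a)) // abs(z4(b))              # magnitude, then truncate toward zero
    if (z4(a) < 0) != (z4(b) < 0):
        q = -q
    return z4_inv(q)

def rem_s(a, b):
    if b == 0:
        return a                              # remainder by zero: omega_A
    if z4(a) == -2**31 and z4(b) == -1:
        return 0                              # overflow case: zero remainder
    r = abs(z4(a)) % abs(z4(b))
    return z4_inv(-r if z4(a) < 0 else r)     # remainder takes the dividend's sign
```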
A.6. Host Call Definition. An extended version of the pvm invocation which is able to progress an inner host-call state-machine in the case of a host-call halt condition is defined as Ψ_H:

(239)
Ψ_H∶ (Y, N_R, N_G, ⟦N_R⟧_13, M, Ω⟨X⟩, X) → ({∎, ☇, ∞} ∪ {F̄} × N_R, N_R, Z_G, ⟦N_R⟧_13, M, X)
(c, ı, ξ, ω, µ, f, x) ↦
  (F̄ × a, ı′, ξ′, ω′, µ′, x)  if ε = h̄ × h ∧ F̄ × a = f(h, ξ′, ω′, µ′, x)
  Ψ_H(c, ı′ + 1 + skip(ı′), ξ″, ω″, µ″, f, x″)  if ε = h̄ × h ∧ (ξ″, ω″, µ″, x″) = f(h, ξ′, ω′, µ′, x)
  (ε, ı′, ξ′, ω′, µ′, x)  otherwise
where (ε, ı′, ξ′, ω′, µ′) = Ψ(c, ı, ξ, ω, µ)
On exit, the instruction counter ı′ references the instruction which caused the exit. Should the machine be invoked again using this instruction counter and code, then the same instruction which caused the exit would be executed. This is sensible when the instruction is one which necessarily needs re-executing, such as in the case of an out-of-gas or page-fault exit.

However, when the exit reason of Ψ is a host-call h̄ × h, the resultant instruction counter has the value of the host-call instruction, and resuming with this state would immediately exit with the same result. Re-invoking therefore requires both the post-host-call machine state and the instruction-counter value of the instruction following the one which resulted in the host-call exit reason. This is always one greater than the host-call instruction's counter plus the relevant argument skip distance. Resuming the machine with this instruction counter will continue execution beyond the host-call instruction.

We use both values of the instruction counter in the definition of Ψ_H since, if the host-call results in a page fault, we need to allow the outer environment to resolve the fault and retry the host-call. Conversely, if we successfully transition state according to the host-call, then on resumption we wish to begin with the instruction directly following the host-call.
A.7. Standard Program Initialization. The software programs which run in each of the four instances where the pvm is utilized in the main document have a setup pattern typical of the output of a compiler and linker: sections for program-specific read-only data, read-write (heap) data and the stack. As an adjunct to this, and very typical of our usage patterns, there is an extra read-only segment through which invocation-specific data may be passed (i.e. arguments). It thus makes sense to define this properly in a single initializer function.

We thus define the standard program code format p, which includes not only the instructions and jump table (previously represented by the term c), but also information on the state of the ram and registers at program start. Given some p which is appropriately encoded together with some argument data a, we can define program code c, registers ω and ram µ through the standard initialization decoder function Y:
(240) Y∶ Y → (Y, ⟦N_R⟧_13, M)?, p ↦ x

Where:

(241) let E_3(|o|) ⌢ E_3(|w|) ⌢ E_2(z) ⌢ E_3(s) ⌢ o ⌢ w ⌢ E_4(|c|) ⌢ c = p
(242) Z_P = 2^14, Z_Q = 2^16, Z_I = 2^24
(243) let P(x ∈ N) ≡ Z_P⌈x ÷ Z_P⌉, Q(x ∈ N) ≡ Z_Q⌈x ÷ Z_Q⌉
(244) 5Z_Q + Q(|o|) + Q(|w| + zZ_P) + Q(s) + Z_I ≤ 2^32

If the above cannot be satisfied, then x = ∅; otherwise x = (c, ω, µ) with c as above and µ, ω such that:
(245) ∀i ∈ N_R ∶ (V_{µ_i}, A_{µ_i}) =
  (o_{i−Z_Q}, R)  if Z_Q ≤ i < Z_Q + |o|
  (0, R)  if Z_Q + |o| ≤ i < Z_Q + P(|o|)
  (w_{i−(2Z_Q+Q(|o|))}, W)  if 2Z_Q + Q(|o|) ≤ i < 2Z_Q + Q(|o|) + |w|
  (0, W)  if 2Z_Q + Q(|o|) + |w| ≤ i < 2Z_Q + Q(|o|) + P(|w|) + zZ_P
  (0, W)  if 2^32 − 2Z_Q − Z_I − P(s) ≤ i < 2^32 − 2Z_Q − Z_I
  (a_{i−(2^32−Z_Q−Z_I)}, R)  if 2^32 − Z_Q − Z_I ≤ i < 2^32 − Z_Q − Z_I + |a|
  (0, R)  if 2^32 − Z_Q − Z_I + |a| ≤ i < 2^32 − Z_Q − Z_I + P(|a|)
  (0, ∅)  otherwise

(246) ∀i ∈ N_16 ∶ ω_i =
  2^32 − 2^16  if i = 1
  2^32 − 2Z_Q − Z_I  if i = 2
  2^32 − Z_Q − Z_I  if i = 10
  |a|  if i = 11
  0  otherwise
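The segment boundaries implied by equations 242 to 245 can be sketched in Python; P and Q are the page- and zone-rounding helpers, and the function below merely evaluates the start addresses of the four major segments for given read-only-data and stack sizes. Illustrative only.

```python
# Constants of equation 242 and the rounding helpers of equation 243.
Z_P, Z_Q, Z_I = 2**14, 2**16, 2**24

def P(x):
    """Round x up to a multiple of Z_P."""
    return Z_P * -(-x // Z_P)

def Q(x):
    """Round x up to a multiple of Z_Q."""
    return Z_Q * -(-x // Z_Q)

def segment_starts(o_len, s):
    """Start addresses of the equation-245 layout for read-only data of
    length o_len and stack size s (a sketch of the memory map only)."""
    ro    = Z_Q                              # read-only data segment
    heap  = 2 * Z_Q + Q(o_len)               # read-write data: the "h" of sbrk
    stack = 2**32 - 2 * Z_Q - Z_I - P(s)     # stack segment
    args  = 2**32 - Z_Q - Z_I                # invocation arguments
    return ro, heap, stack, args
```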
A.8. Argument Invocation Definition. The four instances where the pvm is utilized each expect to be able to pass argument data in and receive some return data back. We thus define the common pvm program-argument invocation function Ψ_M:

(247)
Ψ_M∶ (Y, N, N_G, Y_{Z_I}, Ω⟨X⟩, X) → ((N_G, Y) ∪ {∞, ☇}, X)
(p, ı, ξ, a, f, x) ↦
  (☇, x)  if Y(p) = ∅
  R(Ψ_H(c, ı, ξ, ω, µ, f, x))  if Y(p) = (c, ω, µ)

(248)
where R∶ (ε, ı′, ξ′, ω′, µ′, x) ↦
  (∞, x)  if ε = ∞
  ((ξ′, µ′_{ω′_10⋅⋅⋅+ω′_11}), x)  if ε = ∎ ∧ Z_{ω′_10⋅⋅⋅+ω′_11} ⊆ V_µ′
  ((ξ′, []), x)  if ε = ∎ ∧ Z_{ω′_10⋅⋅⋅+ω′_11} ⊄ V_µ′
  (☇, x)  otherwise
Appendix B. Virtual Machine Invocations
B.1. Host-Call Result Constants.

NONE = 2^32 − 1 : The return value indicating an item does not exist.
WHAT = 2^32 − 2 : Name unknown.
OOB = 2^32 − 3 : The return value for when a memory index is provided for reading/writing which is not accessible.
WHO = 2^32 − 4 : Index unknown.
FULL = 2^32 − 5 : Storage full.
CORE = 2^32 − 6 : Core index unknown.
CASH = 2^32 − 7 : Insufficient funds.
LOW = 2^32 − 8 : Gas limit too low.
HIGH = 2^32 − 9 : Gas limit too high.
HUH = 2^32 − 10 : The item is already solicited or cannot be forgotten.
OK = 0 : The return value indicating general success.

Inner pvm invocations have their own set of result codes:

HALT = 0 : The invocation completed and halted normally.
PANIC = 1 : The invocation completed with a panic.
FAULT = 2 : The invocation completed with a page fault.
HOST = 3 : The invocation completed with a host-call fault.

Note that return codes for a host-call-request exit are any non-zero value less than 2^32 − 13.
B.2. Is-Authorized Invocation. The Is-Authorized invocation is the first and simplest of the four, being totally stateless. It provides only a single host-call function, Ω_G, for determining the amount of gas remaining. It accepts as arguments the work-package as a whole, p, and the core on which it should be executed, c. Formally, it is defined as Ψ_I:

(249)
Ψ_I∶ (P, N_C) → Y ∪ J
(p, c) ↦
  r  if r ∈ {∞, ☇}
  o  if r = (g, o)
where (r, ∅) = Ψ_M(p_c, 0, G_I, E(p, c), F, ∅)

(250)
F∶ (N, N_G, ⟦N_R⟧_6, M) → (Z_G, ⟦N_R⟧_2, M)
(n, ξ, ω, µ) ↦
  Ω_G(ξ, ω, µ)  if n = gas
  (ξ − 10, [WHAT, ω_1, . . . ], µ)  otherwise

Note that for the Is-Authorized host-call dispatch function F in equation 250, we elide the host-call context since, being essentially stateless, it is always ∅.
B.3. Refine Invocation. We define the Refine service-account invocation function as Ψ_R. It has no general access to the state of the Jam chain, with the slight exception of the ability to make a historical lookup. Beyond this it is able to create inner instances of the pvm and dictate pieces of data to export.

The historical-lookup host-call function, Ω_H, is designed to give the same result regardless of the state of the chain for any time when auditing may occur (which we bound to be less than two epochs from being accumulated). The lookup anchor may be up to L timeslots before the recent history and therefore adds to the potential age at the time of audit. We therefore set D = 4,800, a safe amount of eight hours.

The inner pvm invocation host-calls, meanwhile, depend on an integrated pvm type, which we shall denote M. It holds some program code, instruction counter and ram:

(251) M ≡ (p ∈ Y, u ∈ M, i ∈ N_R)

The Export host-call depends on two pieces of context: one a sequence of segments (blobs of length W_S) to which it may append, and the other an argument passed to the invocation function to dictate the number of segments which may be assumed to have already been appended. The latter value ensures that an accurate segment index can be provided to the caller.

The Refine invocation function also implicitly draws upon some recent head state δ and explicitly accepts the work payload, y, together with the service index which is the subject of refinement, s, the prediction of the hash of that service's code c at the time of reporting, the hash of the containing work-package p, the refinement context c, the authorizer hash a and its output o, and an export segment offset ς, the import segments and extrinsic data blobs as dictated by the work-item, i and x̄. It results in either some error J or a pair of the refinement output blob and the export sequence. Formally:
(252)
Ψ_R∶ (H, N_G, N_S, H, Y, X, H, Y, G?, ⟦Y⟧, N) → (Y ∪ J, ⟦Y⟧)
(c, g, s, p, y, c, a, o, i, x̄, ς) ↦
  (BAD, [])  if s ∉ K(δ) ∨ Λ(δ[s], c_t, c) = ∅
  (BIG, [])  otherwise if |Λ(δ[s], c_t, c)| > S
  otherwise:
    let a = E(s, y, p, c, a, o, [↕x ∣ x <− i ⌢ x̄]),
    and (r, (m, e)) = Ψ_M(Λ(δ[s], c_t, c), 5, g, a, F, (∅, []))
    in:
      (r, [])  if r ∈ {∞, ☇}
      (u, e)  if r = (g, u)

(253)
F∶ (N, N_G, ⟦N_R⟧_6, M, (D⟨N → M⟩, ⟦Y⟧)) → (N_G, ⟦N_R⟧_2, M, (D⟨N → M⟩, ⟦Y⟧))
(n, ξ, ω, µ, (m, e)) ↦
  Ω_H(ξ, ω, µ, (m, e), s, δ, c_t)  if n = historical_lookup
  Ω_Y(ξ, ω, µ, (m, e), i)  if n = import
  Ω_Z(ξ, ω, µ, (m, e), ς)  if n = export
  Ω_G(ξ, ω, µ, (m, e))  if n = gas
  Ω_M(ξ, ω, µ, (m, e))  if n = machine
  Ω_P(ξ, ω, µ, (m, e))  if n = peek
  Ω_O(ξ, ω, µ, (m, e))  if n = poke
  Ω_K(ξ, ω, µ, (m, e))  if n = invoke
  Ω_X(ξ, ω, µ, (m, e))  if n = expunge
  (ξ − 10, [WHAT, ω_1, . . . ], µ, (m, e))  otherwise
B.4. Accumulate Invocation. Since this is a transition which can directly affect a substantial amount of on-chain state, our invocation context is accordingly complex. It is a tuple with elements for each of the aspects of state which can be altered through this invocation; beyond the account of the service itself, it includes the deferred-transfer list and several dictionaries for alterations to preimage-lookup state, core assignments, validator key assignments, newly created accounts and alterations to account privilege levels.

Formally, we define our result context to be X, and our invocation context to be a pair of these contexts, X × X, with one dimension being the regular dimension, generally named x, and the other being the exceptional dimension, named y. The only function which actually alters this second dimension is checkpoint, Ω_C, and so it is rarely seen.

We track both regular and exceptional dimensions within our context mutator, but collapse the result of the invocation to one or the other depending on whether the termination was regular or exceptional (i.e. out-of-gas or panic).
(254) X ≡ (s ∈ A?, c ∈ ⟦H⟧_Q^C, v ∈ ⟦K⟧_V, i ∈ N_S, t ∈ ⟦T⟧, n ∈ D⟨N_S → A⟩, p ∈ (m ∈ N_S, a ∈ N_S, v ∈ N_S))

We define Ψ_A, the Accumulation invocation function, as:

(255)
Ψ_A∶ (D⟨N_S → A⟩, N_S, N_G, ⟦O⟧) → X × (r ∈ H?)
(δ†, s, g, o) ↦
  (s ↦ δ†[s], t ↦ [], p ↦ χ, c ↦ φ, v ↦ ι, n ↦ {}, r ↦ ∅)  if δ†[s]_c = ∅
  C(Ψ_M(δ†[s]_c, 10, g, E(o), F, I(δ†[s], s)))  otherwise

(256)
I(a ∈ A, s ∈ N_S) ↦ (x, y) where:
  x = y = (s ↦ a, t ↦ [], i, p ↦ χ, c ↦ φ, v ↦ ι, n ↦ {})
  i = check((E⁻¹_4(H(s, η_0, H_t)) mod (2^32 − 2^9)) + 2^8)
(257)
F∶ (n, ξ, ω, µ, (x, y)) ↦
  G(Ω_R(ξ, ω, µ, x_s, s, δ†), (x, y))  if n = read
  G(Ω_W(ξ, ω, µ, x_s), (x, y))  if n = write
  G(Ω_L(ξ, ω, µ, x_s, s, δ†), (x, y))  if n = lookup
  G(Ω_G(ξ, ω, µ), (x, y))  if n = gas
  G(Ω_I(ξ, ω, µ, x_s, s, δ†), (x, y))  if n = info
  Ω_E(ξ, ω, µ, (x, y))  if n = empower
  Ω_A(ξ, ω, µ, (x, y))  if n = assign
  Ω_D(ξ, ω, µ, (x, y))  if n = designate
  Ω_C(ξ, ω, µ, (x, y))  if n = checkpoint
  Ω_N(ξ, ω, µ, (x, y))  if n = new
  Ω_U(ξ, ω, µ, (x, y), s)  if n = upgrade
  Ω_T(ξ, ω, µ, (x, y), s, δ†)  if n = transfer
  Ω_Q(ξ, ω, µ, (x, y), s)  if n = quit
  Ω_S(ξ, ω, µ, (x, y), H_t)  if n = solicit
  Ω_F(ξ, ω, µ, (x, y), H_t)  if n = forget
  (ξ − 10, [WHAT, ω_1, . . . ], µ, x)  otherwise

(258) G((ξ′, ω′, µ′, s′), (x, y)) ↦ (ξ′, ω′, µ′, (x′, y)) where x′ = x except x′_s = s′

(259)
C∶ (o ∈ Y ∪ {∞, ☇}, (x ∈ X, y ∈ X)) ↦
  x × (r ↦ o)  if o ∈ H
  x × (r ↦ ∅)  if o ∈ Y ∖ H
  y × (r ↦ ∅)  if o ∈ {∞, ☇}
The mutator F governs how this context will alter for any given parameterization, and the collapse function C selects one of the two dimensions of context depending on whether the virtual machine's halt was regular or exceptional.

The initializer function I maps some service account a along with its index s to yield a mutator context such that no alterations to state are implied (beyond those already inherent in a) in either exit scenario. Note that the component i utilizes the random accumulator η_0 and the block's timeslot H_t to create a deterministic sequence of identifiers which are extremely likely to be unique.

Concretely, we create the identifier from the Blake2 hash of the identifier of the creating service, the current random accumulator η_0 and the block's timeslot. Thus, within a service's accumulation it is almost certainly unique, but it is not necessarily unique across all services, nor at all times in the past. We utilize a check function to find the first index in this sequence which does not already represent a service:
(260)
check(i ∈ N_S) ≡
  i  if i ∉ K(δ†)
  check((i − 2^8 + 1) mod (2^32 − 2^9) + 2^8)  otherwise

nb In the highly unlikely event that a block executes to find that a single service index has inadvertently been attached to two different services, then the block is considered invalid. Since no service can predict the identifier sequence ahead of time, none can intentionally disadvantage the block author.
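The index-search of equations 256 and 260 and the bump successor used by the new host-call can be sketched in Python. The set `taken` stands in for K(δ†), the service indices already in use; this is an illustration of the arithmetic, not normative code.

```python
LO, SPAN = 2**8, 2**32 - 2**9   # usable indices live in [2^8, 2^32 - 2^8)

def check(i, taken):
    """Walk the deterministic sequence of equation 260 until an index is
    found which does not already represent a service."""
    while i in taken:
        i = (i - LO + 1) % SPAN + LO
    return i

def bump(i):
    """Successor of equation "new": advance the candidate index by 42
    within the same wrapped range."""
    return LO + (i - LO + 42) % SPAN
```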
B.5. On-Transfer Invocation. We define the On-Transfer service-account invocation function as Ψ_T; it is somewhat similar to the Accumulation invocation except that the only state alterations it facilitates are basic alterations to the storage of the subject account. No further transfers may be made, no privileged operations are possible, and no new accounts may be created nor other operations made on the subject account itself. The function is defined as:
(261)
Ψ_T∶ (D⟨N_S → A⟩, N_S, ⟦T⟧) → A
(δ′, s, t) ↦
  s̄  if s̄_c = ∅ ∨ t = []
  Ψ_M(s̄_c, 15, Σ_{r∈t}(r_g), E(t), F, s̄)  otherwise

(262) where s̄ = δ′[s] except s̄_b = δ′[s]_b + Σ_{r∈t} r_a

(263)
F∶ (n, ξ, ω, µ, s) ↦
  Ω_L(ξ, ω, µ, s, s, δ′)  if n = lookup
  Ω_R(ξ, ω, µ, s, s, δ′)  if n = read
  Ω_W(ξ, ω, µ, s)  if n = write
  Ω_G(ξ, ω, µ)  if n = gas
  Ω_I(ξ, ω, µ, s, s, δ′)  if n = info
  (ξ − 10, [WHAT, ω_1, . . . ], µ, s)  otherwise
nb When considering the mutator functions Ω_R and Ω_I, the final argument passed is the post-accumulation accounts state, δ′. Within the functions, this parameter is denoted simply d. This is intentional and avoids potential confusion, since the functions are also utilized for the Accumulation invocation, where the argument is δ†.
B.6. General Functions. This defines a number of functions broadly of the form Ω(ξ ∈ N_G, ω ∈ ⟦N_R⟧_6, µ ∈ M, s ∈ A, . . . ) = (ξ′ ∈ Z_G, ω′ ∈ ⟦N_R⟧_2, µ′, s′, . . . ). Functions which have a result component equivalent to the corresponding argument may have said component elided in the description. Functions may also depend upon particular additional parameters.

Unlike the Accumulate functions in appendix B.7, these do not mutate an accumulation context, but merely a service account s.

The gas function, Ω_G, has a parameter list suffixed with an ellipsis to denote that any additional parameters may be taken and are passed transparently into its result. This allows it to be easily utilized in multiple pvm invocations.

(264) ξ′ ≡ ξ − g
(265) (ω′, µ′, s′) ≡ (ω, µ, s) if ξ < g; (ω, µ, s) except as indicated below otherwise
Ω_G(ξ, ω, . . . ) — gas = 0, g = 10:
  ω′_0 = ξ′ mod 2^32
  ω′_1 = ⌊ξ′ ÷ 2^32⌋

Ω_L(ξ, ω, µ, s, s, d) — lookup = 1, g = 10:
  let a = s if ω_0 ∈ {s, 2^32 − 1}; d[ω_0] otherwise
  let [h_o, b_o, b_z] = ω_1..4
  let h = H(µ_{h_o⋅⋅⋅+32}) if Z_{h_o⋅⋅⋅+32} ⊆ V_µ; ∅ otherwise
  let v = a_p[h] if a ≠ ∅ ∧ h ∈ K(a_p); ∅ otherwise
  ∀i ∈ N_{min(b_z,|v|)} ∶ µ′_{b_o+i} = v_i if v ≠ ∅ ∧ Z_{b_o⋅⋅⋅+b_z} ⊆ V_µ; µ_{b_o+i} otherwise
  ω′_0 = (NONE if v = ∅; |v| otherwise) if h ≠ ∅ ∧ Z_{b_o⋅⋅⋅+b_z} ⊆ V_µ; OOB otherwise

Ω_R(ξ, ω, µ, s, s, d) — read = 2, g = 10:
  let a = s if ω_0 ∈ {s, 2^32 − 1}; d[ω_0] otherwise if ω_0 ∈ K(d); ∅ otherwise
  let [k_o, k_z, b_o, b_z] = ω_1..5
  let k = H(E_4(s) ⌢ µ_{k_o⋅⋅⋅+k_z}) if Z_{k_o⋅⋅⋅+k_z} ⊆ V_µ; ∅ otherwise
  let v = a_s[k] if a ≠ ∅ ∧ k ∈ K(a_s); ∅ otherwise
  ∀i ∈ N_{min(b_z,|v|)} ∶ µ′_{b_o+i} = v_i if v ≠ ∅ ∧ Z_{b_o⋅⋅⋅+b_z} ⊆ V_µ; µ_{b_o+i} otherwise
  ω′_0 = (NONE if v = ∅; |v| otherwise) if k ≠ ∅ ∧ Z_{b_o⋅⋅⋅+b_z} ⊆ V_µ; OOB otherwise
Ω_W(ξ, ω, µ, s) — write = 3, g = 10:
  let [k_o, k_z, v_o, v_z] = ω_0..4
  let k = H(E_4(s) ⌢ µ_{k_o⋅⋅⋅+k_z}) if Z_{k_o⋅⋅⋅+k_z} ⊆ V_µ; ∅ otherwise
  let a = (s except: K(a_s) = K(s_s) ∖ {k} if v_z = 0; a_s[k] = µ_{v_o⋅⋅⋅+v_z} otherwise) if Z_{v_o⋅⋅⋅+v_z} ⊆ V_µ; ∅ otherwise
  let l = s_s[k] if k ∈ K(s_s); NONE otherwise
  (ω′_0, s′) =
    (l, a)  if k ≠ ∅ ∧ a ≠ ∅ ∧ a_t ≤ a_b
    (FULL, s)  if a ≠ ∅ ∧ a_t > a_b
    (OOB, s)  otherwise

Ω_I(ξ, ω, µ, s, s, d) — info = 4, g = 10:
  let t = s if ω_0 ∈ {s, 2^32 − 1}; (d ∪ x_n)[ω_0] otherwise
  let o = ω_1
  let m = E(t_c, t_b, t_t, t_g, t_m, t_l, t_i) if t ≠ ∅; ∅ otherwise
  ∀i ∈ N_{|m|} ∶ µ′_{o+i} = m_i if m ≠ ∅ ∧ Z_{o⋅⋅⋅+|m|} ⊆ V_µ; µ_{o+i} otherwise
  ω′_0 = OK if m ≠ ∅ ∧ Z_{o⋅⋅⋅+|m|} ⊆ V_µ; NONE if m = ∅; OOB otherwise
B.7. Accumulate Functions. This defines a number of functions broadly of the form Ω(ξ ∈ N_G, ω ∈ ⟦N_R⟧_6, µ ∈ M, (x ∈ X, y ∈ X), . . . ) = (ξ′ ∈ Z_G, ω′ ∈ ⟦N_R⟧_2, µ′, (x′, y′), . . . ). Functions which have a result component equivalent to the corresponding argument may have said component elided in the description. Functions may also depend upon particular additional parameters.

(266) ξ′ ≡ ξ − g
(267) (ω′, µ′, x′, y′) ≡ (ω, µ, x, y) if ξ < g; (ω, µ, x, y) except as indicated below otherwise
Ω_E(ξ, ω, µ, (x, y)) — empower = 5, g = 10:
  let [m, a, v, o, n] = ω_0..5
  let g = {(s ↦ g) where E_4(s) ⌢ E_8(g) = µ_{o+12i⋅⋅⋅+12} ∣ i ∈ N_n} if Z_{o⋅⋅⋅+12n} ⊆ V_µ; ∅ otherwise
  (ω′_0, x′_p) =
    (OK, (m, a, v, g))  if g ≠ ∅
    (OOB, x_p)  otherwise

Ω_A(ξ, ω, µ, (x, y)) — assign = 6, g = 10:
  let o = ω_1
  let c = [µ_{o+32i⋅⋅⋅+32} ∣ i <− N_Q] if Z_{o⋅⋅⋅+32Q} ⊆ V_µ; ∅ otherwise
  (ω′_0, x′) =
    (OK, x except x_c[ω_0] = c)  if ω_0 < C ∧ c ≠ ∅
    (OOB, x)  if c = ∅
    (CORE, x)  otherwise
Ω_D(ξ, ω, µ, (x, y)) — designate = 7, g = 10:
  let o = ω_0
  let v = [µ_{o+336i⋅⋅⋅+336} ∣ i <− N_V] if Z_{o⋅⋅⋅+336V} ⊆ V_µ; ∅ otherwise
  (ω′_0, x′) =
    (OK, x except x_v = v)  if v ≠ ∅
    (OOB, x)  otherwise

Ω_C(ξ, ω, µ, (x, y)) — checkpoint = 8, g = 10:
  y′ = x′
  ω′_0 = ξ′ mod 2^32
  ω′_1 = ⌊ξ′ ÷ 2^32⌋

Ω_N(ξ, ω, µ, (x, y)) — new = 9, g = 10:
  let [o, l, g_l, g_h, m_l, m_h] = ω_0..6
  let c = µ_{o⋅⋅⋅+32} if N_{o⋅⋅⋅+32} ⊆ V_µ; ∅ otherwise
  let g = 2^32 ⋅ g_h + g_l
  let m = 2^32 ⋅ m_h + m_l
  let a ∈ A ∪ {∅} = (c, s ↦ {}, l ↦ {(c, l) ↦ []}, b ↦ a_t, g, m) if c ≠ ∅; ∅ otherwise
  let b = (x_s)_b − a_t
  (ω′_0, x′_i, x′_n, (x′_s)_b) =
    (x_i, check(bump(x_i)), x_n ∪ {x_i ↦ a}, b)  if a ≠ ∅ ∧ b ≥ (x_s)_t
    (OOB, x_i, x_n, (x_s)_b)  if c = ∅
    (CASH, x_i, x_n, (x_s)_b)  otherwise
  where bump(i ∈ N_S) = 2^8 + (i − 2^8 + 42) mod (2^32 − 2^9)

Ω_U(ξ, ω, µ, (x, y), s) — upgrade = 10, g = 10:
  let [o, g_h, g_l, m_h, m_l] = ω_0..5
  let c = µ_{o⋅⋅⋅+32} if N_{o⋅⋅⋅+32} ⊆ V_µ; ∅ otherwise
  let g = 2^32 ⋅ g_h + g_l
  let m = 2^32 ⋅ m_h + m_l
  (ω′_0, x′[s]_c, x′[s]_g, x′[s]_m) =
    (OK, c, g, m)  if c ≠ ∅
    (OOB, x[s]_c, x[s]_g, x[s]_m)  otherwise

Ω_T(ξ, ω, µ, (x, y), s, δ) — transfer = 11, g = 10 + ω_1 + 2^32 ⋅ ω_2:
  let (d, a_l, a_h, g_l, g_h, o) = ω_0..6
  let a = 2^32 ⋅ a_h + a_l
  let g = 2^32 ⋅ g_h + g_l
  let t ∈ T ∪ {∅} = (s, d, a, m, g) with m = µ_{o⋅⋅⋅+M} if N_{o⋅⋅⋅+M} ⊆ V_µ; ∅ otherwise
  let b = (x_s)_b − a
  (ω′_0, x′_t, (x′_s)_b) =
    (OOB, x_t, (x_s)_b)  if t = ∅
    (WHO, x_t, (x_s)_b)  otherwise if d ∉ K(δ ∪ x_n)
    (LOW, x_t, (x_s)_b)  otherwise if g < (δ ∪ x_n)[d]_m
    (HIGH, x_t, (x_s)_b)  otherwise if ξ < g
    (CASH, x_t, (x_s)_b)  otherwise if b < (x_s)_t
    (OK, x_t ⌢ t, b)  otherwise
Ω_Q(ξ, ω, µ, (x, y), s) — quit = 12, g = 10 + ω_1 + 2^32 ⋅ ω_2:
  let [d, o] = ω_0,1
  let a = (x_s)_b − (x_s)_t + B_S
  let g = ξ
  let t ∈ T ∪ {∅, ☇} =
    ∅  if d ∈ {s, 2^32 − 1}
    (s, d, a, m, g) with m = E⁻¹(µ_{o⋅⋅⋅+M})  otherwise if N_{o⋅⋅⋅+M} ⊆ V_µ
    ☇  otherwise
  (ω′_0, x′_s, x′_t) =
    (OK, ∅, x_t), and the virtual machine halts  if t = ∅
    (OOB, x_s, x_t)  otherwise if t = ☇
    (WHO, x_s, x_t)  otherwise if d ∉ K(δ ∪ x_n)
    (LOW, x_s, x_t)  otherwise if g < (δ ∪ x_n)[d]_m
    (OK, ∅, x_t ⌢ t), and the virtual machine halts  otherwise

Ω_S(ξ, ω, µ, (x, y), t) — solicit = 13, g = 10:
  let [o, z] = ω_0,1
  let h = µ_{o⋅⋅⋅+32} if Z_{o⋅⋅⋅+32} ⊆ V_µ; ∅ otherwise
  let a = x_s except:
    a_l[(h, z)] = []  if h ≠ ∅ ∧ (h, z) ∉ K((x_s)_l)
    a_l[(h, z)] = (x_s)_l[(h, z)] ⌢ t  if (x_s)_l[(h, z)] = [x, y]
    ∅  otherwise
  (ω′_0, x′_s) =
    (OOB, x_s)  if h = ∅
    (HUH, x_s)  otherwise if a = ∅
    (FULL, x_s)  otherwise if a_b < a_t
    (OK, a)  otherwise

Ω_F(ξ, ω, µ, (x, y), t) — forget = 14, g = 10:
  let [o, z] = ω_0,1
  let h = µ_{o⋅⋅⋅+32} if Z_{o⋅⋅⋅+32} ⊆ V_µ; ∅ otherwise
  let a = x_s except:
    K(a_l) = K((x_s)_l) ∖ {(h, z)}, K(a_p) = K((x_s)_p) ∖ {h}  if (x_s)_l[(h, z)] ∈ {[], [x, y]} ∧ y < t − D
    a_l[(h, z)] = (x_s)_l[(h, z)] ⌢ t  if |(x_s)_l[(h, z)]| = 1
    a_l[(h, z)] = [(x_s)_l[(h, z)]_2, t]  if (x_s)_l[(h, z)] = [x, y, w] ∧ y < t − D
    ∅  otherwise
  (ω′_0, x′_s) =
    (OOB, x_s)  if h = ∅
    (HUH, x_s)  otherwise if a = ∅
    (OK, a)  otherwise
B.8. Refine Functions. These assume some refine context pair (m, e) ∈ (D⟨N → M⟩, ⟦Y_{W_S}⟧), both initially empty.
Ω_H(ξ, ω, µ, (m, e), s, δ, t) — historical_lookup = 15, g = 10:
  let a = δ[s] if ω_0 = 2^32 − 1 ∧ s ∈ K(δ); δ[ω_0] if ω_0 ∈ K(δ); ∅ otherwise
  let [h_o, b_o, b_z] = ω_1..4
  let h = H(µ_{h_o⋅⋅⋅+32}) if Z_{h_o⋅⋅⋅+32} ⊆ V_µ; ∅ otherwise
  let v = Λ(a, t, h)
  ∀i ∈ N_{min(b_z,|v|)} ∶ µ′_{b_o+i} = v_i if v ≠ ∅ ∧ Z_{b_o⋅⋅⋅+b_z} ⊆ V_µ; µ_{b_o+i} otherwise
  ω′_0 = (|v| if v ≠ ∅; NONE otherwise) if h ≠ ∅ ∧ Z_{b_o⋅⋅⋅+b_z} ⊆ V_µ; OOB otherwise

Ω_Y(ξ, ω, µ, (m, e), i) — import = 16, g = 10:
  let v = i_{ω_0} if ω_0 < |i|; ∅ otherwise
  let o = ω_1
  let l = min(ω_2, W_C ⋅ W_S)
  µ′_{o⋅⋅⋅+l} = v if v ≠ ∅ ∧ N_{o⋅⋅⋅+l} ⊆ V_µ; µ_{o⋅⋅⋅+l} otherwise
  ω′_0 = OOB if Z_{o⋅⋅⋅+l} ⊄ V_µ; NONE otherwise if v = ∅; OK otherwise

Ω_Z(ξ, ω, µ, (m, e), ς) — export = 17, g = 10:
  let p = ω_0
  let z = min(ω_1, W_C ⋅ W_S)
  let x = P_{W_C ⋅ W_S}(µ_{p⋅⋅⋅+z}) if N_{p⋅⋅⋅+z} ⊆ V_µ; ∅ otherwise
  (ω′_0, e′) =
    (OOB, e)  if x = ∅
    (FULL, e)  otherwise if ς + |e| ≥ W_X
    (ς + |e|, e ⌢ x)  otherwise

Ω_M(ξ, ω, µ, (m, e)) — machine = 18, g = 10:
  let [p_o, p_z, i] = ω_0..3
  let p = µ_{p_o⋅⋅⋅+p_z} if Z_{p_o⋅⋅⋅+p_z} ⊆ V_µ; ∅ otherwise
  let n = min(n ∈ N ∶ n ∉ K(m))
  let u = (V ↦ [0, 0, . . . ], A ↦ [∅, ∅, . . . ])
  (ω′_0, m′) =
    (OOB, m)  if p = ∅
    (n, m ∪ {n ↦ (p, u, i)})  otherwise

Ω_P(ξ, ω, µ, (m, e)) — peek = 19, g = 10:
  let [n, a, b, l] = ω_0..4
  let s =
    ∅  if n ∉ K(m)
    V_{m[n]_u}[b⋅⋅⋅+l]  if N_{b⋅⋅⋅+l} ⊆ V_{m[n]_u}
    ☇  otherwise
  (ω′_0, µ′) =
    (OOB, µ)  if s = ☇
    (WHO, µ)  if s = ∅
    (OK, µ′) where µ′ = µ except µ′_{a⋅⋅⋅+l} = s  otherwise
Ω_O(ξ, ω, µ, (m, e)) — poke = 20, g = 10:
  let [n, a, b, l] = ω_0..4
  let u = m[n]_u if n ∈ K(m); ∅ otherwise
  let s = µ_{a⋅⋅⋅+l} if N_{a⋅⋅⋅+l} ⊆ V_µ; ∅ otherwise
  let u′ = u except (V_{u′})_{b⋅⋅⋅+l} = s, (A_{u′})_{b⋅⋅⋅+l} = [W, W, . . . ]
  (ω′_0, m′) =
    (OOB, m)  if s = ∅
    (WHO, m)  otherwise if u = ∅
    (OK, m′) where m′ = m except m′[n]_u = u′  otherwise

Ω_K(ξ, ω, µ, (m, e)) — invoke = 21, g = 10:
  let [n, o] = ω_0..2
  let (g, w) = (E⁻¹_8(µ_{o⋅⋅⋅+8}), [E⁻¹_4(µ_{o+8+4x⋅⋅⋅+4}) ∣ x <− N_13]) if N_{o⋅⋅⋅+60} ⊆ V_µ; (∅, ∅) otherwise
  let (c, ı′, g′, w′, u′) = Ψ(m[n]_p, m[n]_i, g, w, m[n]_u)
  let µ′ = µ except µ′_{o⋅⋅⋅+60} = E_8(g′) ⌢ E([E_4(x) ∣ x <− w′])
  let m′ = m except m′[n]_u = u′, m′[n]_i = (ı′ + 1 if c ∈ {h̄} × N_R; ı′ otherwise)
  (ω′_0, ω′_1, µ′, m′) =
    (OOB, ω_1, µ, m)  if g = ∅
    (WHO, ω_1, µ, m)  otherwise if n ∉ K(m)
    (HOST, h, µ′, m′)  otherwise if c = h̄ × h
    (FAULT, x, µ′, m′)  otherwise if c = F̄ × x
    (PANIC, ω_1, µ′, m′)  otherwise if c = ☇
    (HALT, ω_1, µ′, m′)  otherwise if c = ∎

Ω_X(ξ, ω, µ, (m, e)) — expunge = 22, g = 10:
  let n = ω_0
  (ω′_0, m′) =
    (WHO, m)  if n ∉ K(m)
    (m[n]_i, m ∖ n)  otherwise
Appendix C. Serialization Codec
C.1. Common Terms. Our codec function E is used to serialize some term into a sequence of octets. We define the deserialization function E⁻¹ as the inverse of E, able to decode some sequence into the original value. The codec is designed such that exactly one value is encoded into any given sequence of octets; in cases where this is not desirable we use special codec functions.

C.1.1. Trivial Encodings. We define the serialization of ∅ as the empty sequence:

(268) E(∅) ≡ []

We also define the serialization of an octet-sequence as itself:

(269) E(x ∈ Y) ≡ x

We define anonymous tuples to be encoded as the concatenation of their encoded elements:

(270) E((a, b, . . . )) ≡ E(a) ⌢ E(b) ⌢ . . .

Passing multiple arguments to the serialization functions is equivalent to passing a tuple of those arguments. Formally:

(271) E(a, b, c, . . . ) ≡ E((a, b, c, . . . ))
C.1.2. Integer Encoding. We first define the trivial natural-number serialization functions, which are subscripted by the number of octets of the final sequence. Values are encoded in regular little-endian fashion. This is utilized for almost all integer encoding across the protocol. Formally:

(272)
E_l∶ N_{2^{8l}} → Y_l
x ↦
  []  if l = 0
  [x mod 256] ⌢ E_{l−1}(⌊x ÷ 256⌋)  otherwise

We define general natural-number serialization, able to encode naturals of up to 2^64, as:

(273)
E∶ N_{2^64} → Y_{1∶9}
x ↦
  [0]  if x = 0
  [2^8 − 2^{8−l} + ⌊x ÷ 2^{8l}⌋] ⌢ E_l(x mod 2^{8l})  if ∃l ∈ N_8 ∶ 2^{7l} ≤ x < 2^{7(l+1)}
  [2^8 − 1] ⌢ E_8(x)  otherwise if x < 2^64

Note that at present this is utilized only in encoding the length prefix of variable-length sequences.
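Equations 272 and 273 can be sketched directly in Python; the encoder below follows the three cases of equation 273 term by term. It is an illustrative reading of the formulas, not a reference implementation.

```python
def E_l(x, l):
    """Fixed-length little-endian encoding of equation 272."""
    return bytes((x >> (8 * i)) & 0xFF for i in range(l))

def E_nat(x):
    """General natural-number encoding of equation 273 (up to 2^64)."""
    if x == 0:
        return b"\x00"
    for l in range(8):
        if 2**(7 * l) <= x < 2**(7 * (l + 1)):
            # prefix byte carries both the length l and the high bits of x
            prefix = 2**8 - 2**(8 - l) + (x >> (8 * l))
            return bytes([prefix]) + E_l(x % 2**(8 * l), l)
    assert x < 2**64
    return b"\xff" + E_l(x, 8)   # 9-octet fallback for the largest values
```

Values below 2^7 thus encode as themselves in a single octet, while larger values grow one octet per seven bits of magnitude.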
C.1.3. Sequence Encoding. We define the sequence serialization function E(⟦T⟧) for any T which is itself a subset of the domain of E. We simply concatenate the serializations of each element in the sequence in turn:

(274) E([i_0, i_1, . . . ]) ≡ E(i_0) ⌢ E(i_1) ⌢ . . .

Thus, conveniently, fixed-length octet sequences (e.g. hashes H and its variants) have an identity serialization.
C.1.4. Discriminator Encoding. When we have sets of heterogeneous items, such as a union of different kinds of tuples or sequences of different length, we require a discriminator to determine the nature of the encoded item for successful deserialization. Discriminators are encoded as a natural and are placed immediately prior to the item.

We generally use a length discriminator when serializing sequence terms which have variable length (e.g. general blobs Y or unbound numeric sequences ⟦N⟧), though this is omitted in the case of fixed-length terms such as hashes H.¹⁹ In this case, we simply prefix the term with its length prior to encoding. Thus, for some term y ≡ (x ∈ Y, . . . ), we would generally define its serialized form to be E(|x|) ⌢ E(x) ⌢ . . . . To avoid repetition of the term in such cases, we define the notation ↕x to mean that the term of value x is variable in size and requires a length discriminator. Formally:

(275) ↕x ≡ (|x|, x) thus E(↕x) ≡ E(|x|) ⌢ E(x)

We also define a convenient discriminator operator ¿x specifically for terms defined by some serializable set in union with ∅ (generally denoted, for some set S, as S?):

(276)
¿x ≡
  0  if x = ∅
  (1, x)  otherwise
C.1.5. Bit Sequence Encoding. A sequence of bits b ∈ B is a special case, since encoding each individual bit as an octet would be very wasteful. We instead pack the bits into octets in order of least significant to most, and arrange these into an octet stream. In the case of a variable-length sequence, the length is prefixed as in the general case.

(277)
E(b ∈ B) ≡
  []  if b = []
  [Σ_{i=0}^{min(8,|b|)−1} b_i ⋅ 2^i] ⌢ E(b_{8...})  otherwise
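The bit-packing of equation 277 can be sketched as follows (without the length prefix used for variable-length sequences); purely illustrative.

```python
def encode_bits(bits):
    """Pack a bit sequence into octets, least-significant bit first,
    eight bits per octet, per equation 277."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for j, b in enumerate(bits[i:i + 8]):
            byte |= b << j       # bit j of this group lands at weight 2^j
        out.append(byte)
    return bytes(out)
```

A trailing partial group is simply zero-padded in its high bits, which is why the bit-sequence's length must be carried separately when it is not fixed.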
C.1.6. Dictionary Encoding. In general, dictionaries are placed in the Merkle trie directly (see appendix E for details). However, small dictionaries may reasonably be encoded as a sequence of pairs ordered by the key. Formally:

(278) ∀K, V ∶ E(d ∈ D⟨K → V⟩) ≡ E([(E(k), E(d[k])) ∣ k <− K(d)])
C.2. Block Serialization. A block B is serialized as a tuple of its elements in regular order, as implied in equations 13, 14 and 37. For the header, we define both the regular serialization and the unsigned serialization E_U. Formally:

(279)
E(B) = E(H, E_T, [(r, E_4(a), [(v, E_2(i), s) ∣ (v, i, s) <− j]) ∣ (r, a, j) <− v], c, f,
  [(↕s, ↕p) ∣ (s, p) <− E_P], [(a, f, E_2(v), s) ∣ (a, f, v, s) <− E_A],
  [(w, E_4(t), ↕a) ∣ (w, t, a) <− E_G])
where E_D ≡ (v, c, f)

(280) E(H) = E_U(H) ⌢ E(H_s)
(281) E_U(H) = E(H_p, H_r, H_x) ⌢ E_4(H_t) ⌢ E(¿H_e, ¿H_w, ↕H_o, E_2(H_i), H_v)
(282) E(x ∈ X) ≡ E(x_a, x_s, x_b, x_l) ⌢ E_4(x_t) ⌢ E(¿x_p)
(283) E(x ∈ S) ≡ E(x_h) ⌢ E_4(x_l) ⌢ E(x_u, x_e)

¹⁹ Note that since specific values may belong both to sets which would need a discriminator and those that would not, we are sadly unable to introduce a function capable of serializing according to the term's limitation. A more sophisticated formalism than basic set theory would be needed, capable of taking into account not simply the value but the term from which or to which it belongs in order to do this succinctly.
(284) E(x ∈ L) ≡ E_4(x_s) ⌢ E(x_c, x_l) ⌢ E_8(x_g) ⌢ E(O(x_o))
(285) E(x ∈ W) ≡ E(x_s, x_x, x_c, x_a, ↕x_o, x_r)
(286) E(x ∈ P) ≡ E(↕x_j, E_4(x_h), x_u, x_p, ↕x_x, x_w)
(287) E(x ∈ I) ≡ E(E_4(x_s), x_c, ↕x_y, E_8(x_g), [(h, E_2(i)) ∣ (h, i) <− x_i], [(h, E_4(i)) ∣ (h, i) <− x_x], E_2(x_e))
(288) E(x ∈ C) ≡ E(x_y, x_r)

(289)
O(o ∈ J ∪ Y) ≡
  (0, ↕o)  if o ∈ Y
  1  if o = ∞
  2  if o = ☇
  3  if o = BAD
  4  if o = BIG

Note the use of O above to succinctly encode the result of a work item, and the slight transformations of E_G and E_P to take account of the fact that their inner tuples contain variable-length sequence terms a and p which need length discriminators.
Appendix D. State Merklization
The Merklization process defines a cryptographic commitment from which arbitrary information within state may be
provided as being authentic in a concise and swift fashion. We describe this in two stages; the first defines a mapping
from 32-octet sequences to (unlimited) octet sequences in a process called state serialization. The second forms a 32-octet
commitment from this mapping in a process called Merklization.
D.1. Serialization. The serialization of state primarily involves placing all the various components of σ into a single
mapping from 32-octet sequence state-keys to octet sequences of indefinite length. The state-key is constructed from a
hash component and a chapter component, equivalent to either the index of a state component or, in the case of the
inner dictionaries of δ, a service index.
We define the state-key constructor functions C as:
(290)  C: (N_{2^8} ∪ (N_{2^8}, N_S) ∪ (N_S, H)) → Y_32
       C(i ∈ N_{2^8}) ≡ [i, 0, 0, …]
       C(i ∈ N_{2^8}, s ∈ N_S) ≡ [i, n_0, n_1, n_2, n_3, 0, 0, …] where n = E_4(s)
       C(s ∈ N_S, h ∈ H) ≡ [n_0, h_0, n_1, h_1, n_2, h_2, n_3, h_3, h_4, h_5, …, h_27] where n = E_4(s)
The state serialization is then defined as the dictionary built from the amalgamation of each of the components.
Cryptographic hashing ensures that there will be no duplicate state-keys given that there are no duplicate inputs to C.
Formally, we define T which transforms some state σ into its serialized form:
(291)  T(σ) ≡ {
         C(1) ↦ E([x ∣ x ∈ α]),
         C(2) ↦ E(φ),
         C(3) ↦ E([(h, E_M(b), s, p) ∣ (h, b, s, p) ∈ β]),
         C(4) ↦ E(γ_k, γ_z, { 0 if γ_s ∈ ⟦C⟧_E ; 1 if γ_s ∈ ⟦H_B⟧_E }, γ_s, γ_a),
         C(5) ↦ E([x ∣ x ∈ ψ_g], [x ∣ x ∈ ψ_b], [x ∣ x ∈ ψ_w], [x ∣ x ∈ ψ_o]),
         C(6) ↦ E(η),
         C(7) ↦ E(ι),
         C(8) ↦ E(κ),
         C(9) ↦ E(λ),
         C(10) ↦ E([¿(w, E_4(t)) ∣ (w, t) ∈ ρ]),
         C(11) ↦ E_4(τ),
         C(12) ↦ E_4(χ_m, χ_a, χ_v) ⌢ E(χ_g),
         C(13) ↦ E_4(π),
         ∀(s ↦ a) ∈ δ: C(255, s) ↦ a_c ⌢ E_8(a_b, a_g, a_m, a_l) ⌢ E_4(a_i),
         ∀(s ↦ a) ∈ δ, ∀(h ↦ v) ∈ a_s: C(s, h) ↦ v,
         ∀(s ↦ a) ∈ δ, ∀(h ↦ p) ∈ a_p: C(s, h) ↦ p,
         ∀(s ↦ a) ∈ δ, ∀((h, l) ↦ t) ∈ a_l: C(s, E_4(l) ⌢ ¬h_{4…}) ↦ E([E_4(x) ∣ x ∈ t])
       }
Note that most rows describe a single mapping between a key derived from a natural number and the serialization of a state component. However, the final four rows each define sets of mappings, since these items act over all service accounts and, in the case of the final three rows, over the keys of a nested dictionary within the service.
Also note that all non-discriminator numeric serialization in state is done in fixed-length fashion according to the size of the term.
D.2. Merklization. With T defined, we now define the rest of M_σ, which primarily involves transforming the serialized mapping into a cryptographic commitment. We define this commitment as the root of the binary Patricia Merkle Trie with a format optimized for modern compute hardware, primarily by optimizing sizes to fit succinctly into typical memory layouts and reducing the need for unpredictable branching.
D.2.1. Node Encoding and Trie Identification. We identify (sub-)tries as the hash of their root node, with one exception: empty (sub-)tries are identified as the zero-hash, H_0.
Nodes are fixed in size at 512 bits (64 bytes). Each node is either a branch or a leaf. The first bit discriminates between these two types.
In the case of a branch, the remaining 511 bits are split between the two child node hashes, using the last 255 bits of the 0-bit (left) sub-trie identity and the full 256 bits of the 1-bit (right) sub-trie identity.
Leaf nodes are further subdivided into embedded-value leaves and regular leaves. The second bit of the node discriminates between these.
In the case of an embedded-value leaf, the remaining 6 bits of the first byte are used to store the embedded value size. The following 31 bytes are dedicated to the first 31 bytes of the key. The last 32 bytes are defined as the value, filled with zeroes if its length is less than 32 bytes.
In the case of a regular leaf, the remaining 6 bits of the first byte are zeroed. The following 31 bytes store the first 31 bytes of the key. The last 32 bytes store the hash of the value.
Formally, we define the encoding functions B and L:
(292)  B: (H, H) → B_512
       B(l, r) ≡ [0] ⌢ bits(l)_{1…} ⌢ bits(r)

(293)  L: (H, Y) → B_512
       L(k, v) ≡ [1, 0] ⌢ bits(E_1(|v|))_{…6} ⌢ bits(k)_{…248} ⌢ bits(v) ⌢ [0, 0, …]  if |v| ≤ 32
                 [1, 1, 0, 0, 0, 0, 0, 0] ⌢ bits(k)_{…248} ⌢ bits(H(v))               otherwise
We may then define the basic Merklization function M_σ as:

(294)  M_σ(σ) ≡ M({(bits(k), v) ∣ (k ↦ v) ∈ T(σ)})

(295)  M(d ∈ D⟨B → (H, Y)⟩) ≡ H_0                        if |d| = 0
                               H(bits⁻¹(L(k, v)))         if V(d) = {(k, v)}
                               H(bits⁻¹(B(M(l), M(r))))   otherwise,
       where l = {(b_{1…} ↦ p) ∣ (b ↦ p) ∈ d, b_0 = 0}
             r = {(b_{1…} ↦ p) ∣ (b ↦ p) ∈ d, b_0 = 1}
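As a concrete illustration of the node formats of D.2.1 and the recursion of equation 295, a simplified sketch follows. We assume an LSB-first bits convention within each octet and Blake 2b for H, and rather than stripping consumed key bits we keep full keys and track depth, so it is an approximation of the recursion, not a conformant implementation.

```python
from hashlib import blake2b

def H(x: bytes) -> bytes:
    return blake2b(x, digest_size=32).digest()

def branch(l: bytes, r: bytes) -> bytes:
    # Branch node: first bit (LSB of byte 0, assumed LSB-first) is 0; the
    # remaining 511 bits carry the last 255 bits of the left child identity
    # and all 256 bits of the right child identity.
    return bytes([l[0] & 0xFE]) + l[1:] + r

def leaf(key: bytes, value: bytes) -> bytes:
    if len(value) <= 32:
        # Embedded-value leaf: bits [1,0], six bits of value length,
        # 31 key octets, and the value zero-padded to 32 octets.
        head = bytes([0b01 | (len(value) << 2)])
        return head + key[:31] + value + bytes(32 - len(value))
    # Regular leaf: bits [1,1,0,0,0,0,0,0], 31 key octets, value hash.
    return bytes([0b11]) + key[:31] + H(value)

def merklize(d: dict[bytes, bytes], depth: int = 0) -> bytes:
    # Simplified M (eq. 295): keys are navigated bit-by-bit by depth.
    if not d:
        return bytes(32)                       # empty sub-trie: the zero hash
    if len(d) == 1:
        (k, v), = d.items()
        return H(leaf(k, v))
    left, right = {}, {}
    for k, v in d.items():
        bit = (k[depth // 8] >> (depth % 8)) & 1
        (right if bit else left)[k] = v
    return H(branch(merklize(left, depth + 1), merklize(right, depth + 1)))
```

Note that every node is exactly 64 octets, so a branch holding two 32-octet child identities loses only the left child's discriminated first bit.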
Appendix E. General Merklization
E.1. Binary Merkle Trees. The Merkle tree is a cryptographic data structure yielding a hash commitment to a specific
sequence of values. It provides O(N) computation and O(log N) proof size for inclusion. This well-balanced formulation ensures that the maximum depth of any leaf is minimal and that the number of leaves at that depth is also minimal.
The underlying function for our Merkle trees is the node function N, which accepts some sequence of blobs of some length n together with a hash function and provides either such a blob back or a hash:

(296)  N: (⟦Y_n⟧, Y → H) → (Y_n ∪ H)
       N(v, H) ≡ H_0  if |v| = 0
                 v_0  if |v| = 1
                 H($node ⌢ N(v_{…⌈|v|/2⌉}, H) ⌢ N(v_{⌈|v|/2⌉…}, H))  otherwise
The astute reader will realize that if our Y_n happens to be equivalent to H then this function will always evaluate to a member of H. That said, for it to be secure, care must be taken to ensure there is no possibility of preimage collision. For this purpose we include the hash prefix $node to minimize the chance of this; simply ensure any items are hashed with a different prefix and the system can be considered secure.
We also define the trace function T, which returns each opposite node from top to bottom as the tree is navigated to arrive at some leaf corresponding to the item of a given index into the sequence. It is useful in creating justifications of
data inclusion.
(297)  T: (⟦Y⟧, N_{|v|}, Y → H) → ⟦Y ∪ H⟧
       T(v, i, H) ≡ [N(P_⊥(v, i), H)] ⌢ T(P_⊤(v, i), i − P_I(v, i), H)  if |v| > 1
                    []                                                   otherwise
where
       P_s(v, i) ≡ v_{…⌈|v|/2⌉}  if (i < ⌈|v|/2⌉) = s
                   v_{⌈|v|/2⌉…}  otherwise
       P_I(v, i) ≡ 0             if i < ⌈|v|/2⌉
                   ⌈|v|/2⌉       otherwise
From this we define our other Merklization functions.
E.1.1. Well-Balanced Tree. We define the well-balanced binary Merkle function as M_B:

(298)  M_B: (⟦Y⟧, Y → H) → H
       M_B(v, H) ≡ H(v_0)   if |v| = 1
                   N(v, H)  otherwise
This is suitable for creating proofs on data which is not much greater than 32 octets in length since it avoids hashing
each item in the sequence. For sequences with larger data items, it is better to hash them beforehand to ensure proof-size
is minimal since each proof will generally contain a data item.
Note: In the case that no hash function argument H is supplied, we may assume the Blake 2b hash function, H.
E.1.2. Constant-Depth Tree. We define the constant-depth binary Merkle function as M and the corresponding justification generation function as J, with the latter having an optional subscript x, which limits the justification to only those nodes required to justify inclusion of a well-aligned subtree of (maximum) size 2^x:

(299)  M: (⟦Y⟧, Y → H) → H
       M(v, H) ≡ N(C(v, H), H)
(300)  J: (⟦Y⟧, N_{|v|}, Y → H) → ⟦H⟧
       J(v, i, H) ≡ T(C(v, H), i, H)
(301)  J_x: (⟦Y⟧, N_{|v|}, Y → H) → ⟦H⟧
       J_x(v, i, H) ≡ T(C(v, H), i, H)_{…⌈max(0, log2(max(1, |v|)) − x)⌉}
For the latter justification to be acceptable, we must assume the target observer knows not merely the value of the item at the given index, but also all other items within its 2^x-size subtree.
As above, we may assume a default value for H of the Blake 2b hash function, H.
In all cases, a constancy preprocessor function C is applied, which hashes all data items with a fixed prefix and then pads them to the next power of two with the zero hash H_0:

(302)  C: (⟦Y⟧, Y → H) → ⟦H⟧
       C(v, H) ≡ v*  where |v*| = 2^⌈log2(max(1, |v|))⌉
                      and v*_i = H($leaf ⌢ v_i)  if i < |v|
                                 H_0              otherwise
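The node function and both Merklization flavours admit a short sketch; we assume the $node and $leaf prefixes denote the ASCII bytes of those words and that H is Blake 2b, as the surrounding text suggests.

```python
from hashlib import blake2b

def H(x: bytes) -> bytes:
    return blake2b(x, digest_size=32).digest()

def N(v: list[bytes]) -> bytes:
    # Node function (eq. 296): zero hash for empty, the sole blob for a
    # singleton, otherwise the prefixed hash of the two halves.
    if len(v) == 0:
        return bytes(32)
    if len(v) == 1:
        return v[0]
    mid = (len(v) + 1) // 2                    # ceil(|v| / 2)
    return H(b"node" + N(v[:mid]) + N(v[mid:]))

def M_B(v: list[bytes]) -> bytes:
    # Well-balanced Merklization (eq. 298): a lone item is hashed directly.
    return H(v[0]) if len(v) == 1 else N(v)

def C(v: list[bytes]) -> list[bytes]:
    # Constancy preprocessor (eq. 302): hash each item with the $leaf prefix
    # and pad with the zero hash to the next power of two.
    n = 1
    while n < max(1, len(v)):
        n *= 2
    return [H(b"leaf" + x) for x in v] + [bytes(32)] * (n - len(v))

def M(v: list[bytes]) -> bytes:
    # Constant-depth Merklization (eq. 299).
    return N(C(v))
```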
E.2. Merkle Mountain Ranges. The Merkle mountain range (mmr) is an append-only cryptographic data structure which yields a commitment to a sequence of values. Appending to an mmr and proving inclusion of some item within it are both O(log N) in time and space for the size of the set.
We define a Merkle mountain range as a member of ⟦H?⟧, a sequence of peaks, each peak the root of a Merkle tree containing 2^i items where i is the index in the sequence. Since we support set sizes which are not always powers-of-two-minus-one, some peaks may be empty rather than a Merkle root.
Since the sequence of hashes is somewhat unwieldy as a commitment, Merkle mountain ranges are themselves generally hashed before being published. Hashing them removes the possibility of further appending, so the range itself is kept on the system which needs to generate future proofs.
We define the append function A as:

(303)  A: (⟦H?⟧, H, Y → H) → ⟦H?⟧
       A(r, l, H) ≡ P(r, l, 0, H)
where
       P: (⟦H?⟧, H, N, Y → H) → ⟦H?⟧
       P(r, l, n, H) ≡ r ⌢ [l]                            if n = |r|
                       R(r, n, l)                          if n < |r| ∧ r_n = ∅
                       P(R(r, n, ∅), H(r_n ⌢ l), n + 1, H) otherwise
and
       R: (⟦T⟧, N, T) → ⟦T⟧
       R(s, i, v) ≡ s*  where s* = s except s*_i = v

We define the mmr encoding function as E_M:

(304)  E_M: ⟦H?⟧ → Y
       E_M(b) ≡ E([¿x ∣ x ∈ b])
Appendix F. Shuffling
The Fisher-Yates shuffle function is defined formally as:

(305)  ∀T, ∀l ∈ N: F: (⟦T⟧_l, ⟦N⟧_l) → ⟦T⟧_l
       F(s, r) ≡ [s_{r_0 mod l}] ⌢ F(s*_{…l−1}, r_{1…})  where s* = s except s*_{r_0 mod l} = s_{l−1}  if s ≠ []
                 []                                        otherwise
Since it is often useful to shuffle a sequence based on some random seed in the form of a hash, we provide a secondary form of the shuffle function F which accepts a 32-byte hash instead of the numeric sequence. We define Q, the numeric-sequence-from-hash function, thus:

(306)  ∀l ∈ N: Q_l: H → ⟦N⟧_l
       Q_l(h) ≡ [E_4⁻¹(H(h ⌢ E_4(⌊i/8⌋))_{4i mod 32 …+4}) ∣ i ∈ N_l]
(307)  ∀T, ∀l ∈ N: F: (⟦T⟧_l, H) → ⟦T⟧_l
       F(s, h) ≡ F(s, Q_l(h))
Appendix G. Bandersnatch Ring VRF
The Bandersnatch curve is defined by Masson, Sanso, and Zhang 2021.
The singly-contextualized Bandersnatch Schnorr-like signatures F^m_k⟨c⟩ are defined as a formulation under the ietf vrf template specified by Hosseini and Galassi 2024 (as IETF VRF) and further detailed by Goldberg et al. 2023.

(308)  F^{m ∈ Y}_{k ∈ H_B}⟨c ∈ H⟩ ⊂ Y_96 ≡ {x ∣ x ∈ Y_96, verify(k, c, m, decode(x_{…32}), decode(x_{32…})) = ⊤}
(309)  Y(s ∈ F^m_k⟨c⟩) ∈ H ≡ hashed_output(decode(s_{…32}))

The singly-contextualized Bandersnatch Ringvrf proofs F̄^m_r⟨c⟩ are a zk-snark-enabled analogue utilizing the Pedersen vrf, also defined by Hosseini and Galassi 2024 and further detailed by Jeffrey Burdges et al. 2023.

(310)  O(⟦H_B⟧) ∈ Y_R ≡ KZG_commitment(⟦H_B⟧)
(311)  F̄^{m ∈ Y}_{r ∈ Y_R}⟨c ∈ H⟩ ⊂ Y_784 ≡ {x ∣ x ∈ Y_784, verify(r, c, m, decode(x_{…32}), decode(x_{32…})) = ⊤}
(312)  Y(p ∈ F̄^m_r⟨c⟩) ∈ H ≡ hashed_output(decode(p_{…32}))
Note that in the case a key H_B has no corresponding Bandersnatch point when constructing the ring, the Bandersnatch padding point as stated by Hosseini and Galassi 2024 should be substituted.
Appendix H. Erasure Coding
The foundation of the data-availability and distribution system of Jam is a systematic Reed-Solomon erasure coding function in GF(2^16) of rate 342:1023, the same transform as done by the algorithm of Lin, Chung, and Han 2014. We use a little-endian Y_2 form of the 16-bit GF points with a functional equivalence given by E_2. From this we may assume the encoding function C: ⟦Y_2⟧_342 → ⟦Y_2⟧_1023 and the recovery function R: ⟦(Y_2, N_1023)⟧_342 → ⟦Y_2⟧_342. Encoding is done by extrapolating a data blob of size 684 octets (provided in C here as 342 octet pairs) into 1,023 octet pairs. Recovery is done by collecting together any distinct 342 octet pairs, together with their indices, and transforming this into the original sequence of 342 octet pairs.
Practically speaking, this allows for the efficient encoding and recovery of data whose size is a multiple of 684 octets. Data whose length is not divisible by 684 must be padded (we pad with zeroes). We use this erasure-coding in two contexts within the Jam protocol: one where we encode variable-sized (but typically very large) data blobs for the Audit
DA and block-distribution system, and the other where we encode much smaller fixed-size data segments for the Import
DA system.
For the Import DA system, we deal with an input size of 4,104 octets, resulting in data-parallelism of order six. We may attain a greater degree of data parallelism if encoding or recovering more than one segment at a time, though for recovery we may be restricted to requiring each segment to be formed from the same set of indices (depending on the specific algorithm).
H.1. Blob Encoding and Recovery. We assume some data blob d ∈ Y_{684k}, k ∈ N. We are able to express this as a whole number k of pieces, each a sequence of 684 octets. We denote these (data-parallel) pieces p ∈ ⟦Y_684⟧ = unzip_684(d).
Each piece is then reformed as 342 octet pairs and erasure-coded using C as above to give 1,023 octet pairs per piece. The resulting matrix is grouped by its pair-index and concatenated to form 1,023 chunks, each of k octet-pairs. Any 342 of these chunks may then be used to reconstruct the original data d.
Formally we begin by defining four utility functions for splitting some large sequence into a number of equal-sized sub-sequences and for reconstituting such subsequences back into a single large sequence:
(313)  ∀n, k ∈ N: split_n(d ∈ Y_{kn}) ∈ ⟦Y_n⟧_k ≡ [d_{0…+n}, d_{n…+n}, …, d_{(k−1)n…+n}]
(314)  ∀n, k ∈ N: join_n(c ∈ ⟦Y_n⟧_k) ∈ Y_{kn} ≡ c_0 ⌢ c_1 ⌢ …
(315)  ∀n, k ∈ N: unzip_n(d ∈ Y_{kn}) ∈ ⟦Y_n⟧_k ≡ [[d_{jk+i} ∣ j ∈ N_n] ∣ i ∈ N_k]
(316)  ∀n, k ∈ N: lace_n(c ∈ ⟦Y_n⟧_k) ∈ Y_{kn} ≡ d  where ∀i ∈ N_k, j ∈ N_n: d_{jk+i} = (c_i)_j
We define the transposition operator hence:
(317)
T
[[x
0,0
, x
0,1
, x
0,2
, . . . ], [x
1,0
, x
1,1
, . . . ], . . . ] [[x
0,0
, x
1,0
, x
2,0
, . . . ], [x
0,1
, x
1,1
, . . . ], . . . ]
We may then define our erasure-code chunking function which accepts an arbitrary sized data blob whose length
divides wholly into 684 octets and results in 1,023 sequences of sequences each of smaller blobs:
(318)  ∀k ∈ N: C(d ∈ Y_{684k}) ∈ ⟦Y_{2k}⟧_1023 ≡ [join(c) ∣ c ∈ ᵀ[C(p) ∣ p ∈ unzip_684(d)]]
The original data may be reconstructed with any 342 of the 1,023 resultant items (along with their indices). If the
original 342 items are known then reconstruction is just their concatenation.
(319)  ∀k ∈ N: R(c ∈ ⟨{(Y_{2k}, N_1023)}⟩_342) ∈ Y_{684k} ≡
       E([x ∣ (x, i) ∈ c])                                     if [i ∣ (x, i) ∈ c] = [0, 1, …, 341]
       lace_k([R([(split_2(x)_p, i) ∣ (x, i) ∈ c]) ∣ p ∈ N_k])  otherwise
Segment encoding/decoding may be done using the same functions albeit with a constant k = 6.
H.2. Code Word Representation. For the sake of brevity we call each octet pair a word. The code words (including the message words) are treated as elements of the finite field F_{2^16}. The field is generated as an extension of F_2 using the irreducible polynomial:

(320)  x^16 + x^5 + x^3 + x^2 + 1

Hence:

(321)  F_{2^16} ≅ F_2[x] / ⟨x^16 + x^5 + x^3 + x^2 + 1⟩

We name the generator of F_{2^16}/F_2, the root of the above polynomial, α, such that F_{2^16} = F_2(α).
Instead of using the standard basis {1, α, α^2, …, α^15}, we opt for a representation of F_{2^16} which performs more efficiently for the encoding and the decoding process. To that aim, we name this specific representation of F_{2^16} as F̃_{2^16} and define it as a vector space generated by the following Cantor basis:
v_0  = 1
v_1  = α^15 + α^13 + α^11 + α^10 + α^7 + α^6 + α^3 + α
v_2  = α^13 + α^12 + α^11 + α^10 + α^3 + α^2 + α
v_3  = α^12 + α^10 + α^9 + α^5 + α^4 + α^3 + α^2 + α
v_4  = α^15 + α^14 + α^10 + α^8 + α^7 + α
v_5  = α^15 + α^14 + α^13 + α^11 + α^10 + α^8 + α^5 + α^3 + α^2 + α
v_6  = α^15 + α^12 + α^8 + α^6 + α^3 + α^2
v_7  = α^14 + α^4 + α
v_8  = α^14 + α^13 + α^11 + α^10 + α^7 + α^4 + α^3
v_9  = α^12 + α^7 + α^6 + α^4 + α^3
v_10 = α^14 + α^13 + α^11 + α^9 + α^6 + α^5 + α^4 + α
v_11 = α^15 + α^13 + α^12 + α^11 + α^8
v_12 = α^15 + α^14 + α^13 + α^12 + α^11 + α^10 + α^8 + α^7 + α^5 + α^4 + α^3
v_13 = α^15 + α^14 + α^13 + α^12 + α^11 + α^9 + α^8 + α^5 + α^4 + α^2
v_14 = α^15 + α^14 + α^13 + α^12 + α^11 + α^10 + α^9 + α^8 + α^5 + α^4 + α^3
v_15 = α^15 + α^12 + α^11 + α^8 + α^4 + α^3 + α^2 + α
Every message word m_i = m_{i,15} … m_{i,0} consists of 16 bits. As such it may be regarded as a binary vector of length 16:

(322)  m_i = (m_{i,0}, …, m_{i,15})

where m_{i,0} is the least significant bit of message word m_i. Accordingly we consider the field element m̃_i = Σ_{j=0}^{15} m_{i,j} v_j to represent that message word.
Similarly, we assign a unique index to each validator between 0 and 1,022 and we represent validator i with the field element:

(323)  ĩ = Σ_{j=0}^{15} i_j v_j

where i = i_15 … i_0 is the binary representation of i.
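Arithmetic in F_{2^16} reduces products modulo the irreducible polynomial of equation 320. A naive bit-serial multiplier over the standard basis is sketched below; the Cantor-basis change of representation used for fast encoding and decoding is omitted.

```python
# x^16 + x^5 + x^3 + x^2 + 1, i.e. bits 16, 5, 3, 2 and 0 set.
IRREDUCIBLE = 0x1002D

def gf_mul(a: int, b: int) -> int:
    # Carry-less ("Russian peasant") multiplication in F_2[x]/(IRREDUCIBLE):
    # accumulate shifted copies of a for each set bit of b, reducing whenever
    # a overflows 16 bits.
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10000:
            a ^= IRREDUCIBLE
        b >>= 1
    return r
```

Since every nonzero element of a field with 2^16 elements satisfies x^(2^16 − 1) = 1, repeatedly multiplying by α (here the element 2) must return to 1 after 65,535 steps, which makes a convenient sanity check.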
H.3. The Generator Polynomial. To erasure-code a message of 342 words into 1,023 code words, we represent each message word as a field element as described in the previous section and we interpolate the polynomial p(y) of degree at most 341 which satisfies the following equalities:

(324)  p(ĩ) = m̃_i  for i = 0, 1, …, 341

After finding p(y) with such properties, we evaluate p at the following points:

(325)  r_i = p(ĩ)  for i = 342, 343, …, 1022
We then distribute the message words and the extra code words among the validators according to their corresponding
indices.
Appendix I. Index of Notation
I.1. Sets.
I.1.1. Regular Notation.
N: The set of non-negative integers. Subscript denotes one greater than the maximum. See section 3.4.
N+: The set of positive integers (not including zero).
N_B: The set of balance values. Equivalent to N_{2^64}. See equation 31.
N_G: The set of unsigned gas values. Equivalent to N_{2^64}. See equation 33.
N_L: The set of blob length values. Equivalent to N_{2^32}. See section 3.4.
N_S: The set from which service indices are drawn. Equivalent to N_{2^32}. See equation 87.
N_T: The set of timeslot values. Equivalent to N_{2^32}. See equation 36.
Q: The set of rational numbers. Unused.
R: The set of real numbers. Unused.
Z: The set of integers. Subscript denotes range. See section 3.4.
Z_G: The set of signed gas values. Equivalent to Z_{−2^63…2^63}. See equation 33.
I.1.2. Custom Notation.
A: The set of service accounts. See equation 89.
B: The set of Boolean sequences/bitstrings. Subscript denotes length. See section 3.7.
C: The set of seal-key tickets. See equation 50. Not used as the set of complex numbers.
D: The set of dictionaries. See section 3.5.
D⟨K → V⟩: The set of dictionaries making a partial bijection of domain K to range V. See section 3.5.
E: The set of valid Ed25519 signatures. A subset of Y_64. See section 3.8.
E_K⟨M⟩: The set of valid Ed25519 signatures of the key K and message M. A subset of E. See section 3.8.
F: The set of Bandersnatch signatures. A subset of Y_64. See section 3.8. NOTE: Not used as finite fields.
F^M_K⟨C⟩: The set of Bandersnatch signatures of the public key K, context C and message M. A subset of F. See section 3.8.
F̄: The set of Bandersnatch Ringvrf proofs. See section 3.8.
F̄^M_R⟨C⟩: The set of Bandersnatch Ringvrf proofs of the root R, context C and message M. A subset of F̄. See section 3.8.
G: The set of data segments, equivalent to octet sequences of length W_S. See equation 173.
H: The set of 32-octet cryptographic values. A subset of Y_32. H without a subscript generally implies a hash function result. See section 3.8. NOTE: Not used as quaternions.
H_B: The set of Bandersnatch public keys. A subset of Y_32. See section 3.8 and appendix G.
H_E: The set of Ed25519 public keys. A subset of Y_32. See section 3.8.2.
I: The set of work items. See equation 175.
J: The set of work execution errors.
K: The set of validator key-sets. See equation 51.
L: The set of work results.
M: The set of pvm ram states. A superset of Y_{2^32}. See appendix A.
O: The accumulation operand element, corresponding to a single work result.
P: The set of work-packages. See equation 174.
S: The set of availability specifications.
T: The set of deferred transfers.
U: Unused.
V_µ: The set of validly readable indices for pvm ram µ. See appendix A.
V_µ: The set of validly writable indices for pvm ram µ. See appendix A.
W: The set of work-reports.
X: The set of refinement contexts.
Y: The set of octet strings/“blobs”. Subscript denotes length. See section 3.7.
Y_BLS: The set of BLS public keys. A subset of Y_144. See section 3.8.2.
Y_R: The set of Bandersnatch ring roots. A subset of Y_144. See section 3.8 and appendix G.
I.2. Functions.
Λ: The historical lookup function. See equation 93.
Ξ: The work result computation function. See equation 182.
Υ: The general state transition function. See equations 12, 16.
Φ: The key-nullifier function. See equation 58.
Ψ: The whole-program pvm machine state-transition function. See appendix A.
Ψ_1: The single-step (pvm) machine state-transition function. See appendix A.
Ψ_A: The Accumulate pvm invocation function. See appendix B.
Ψ_H: The host-function invocation (pvm) with host-function marshalling. See appendix A.
Ψ_I: The Is-Authorized pvm invocation function. See appendix B.
Ψ_M: The marshalling whole-program pvm machine state-transition function. See appendix A.
Ψ_R: The Refine pvm invocation function. See appendix B.
Ψ_T: The On-Transfer pvm invocation function. See appendix B.
Ω: Virtual machine host-call functions. See appendix B.
Ω_A: Assign-core host-call.
Ω_C: Checkpoint host-call.
Ω_D: Designate-validators host-call.
Ω_E: Empower-service host-call.
Ω_F: Forget-preimage host-call.
Ω_G: Gas-remaining host-call.
Ω_H: Historical-lookup-preimage host-call.
Ω_I: Information-on-service host-call.
Ω_K: Kickoff-pvm host-call.
Ω_L: Lookup-preimage host-call.
Ω_M: Make-pvm host-call.
Ω_N: New-service host-call.
Ω_O: Poke-pvm host-call.
Ω_P: Peek-pvm host-call.
Ω_Q: Quit-service host-call.
Ω_S: Solicit-preimage host-call.
Ω_R: Read-storage host-call.
Ω_T: Transfer host-call.
Ω_U: Upgrade-service host-call.
Ω_W: Write-storage host-call.
Ω_X: Expunge-pvm host-call.
Ω_Y: Import-segment host-call.
Ω_Z: Export-segment host-call.
I.3. Utilities, Externalities and Standard Functions.
A(…): The Merkle mountain range append function. See equation 303.
B_n(…): The octets-to-bits function for n octets. Superscripted −1 to denote the inverse. See equation 222.
C(…): The group of erasure-coding functions.
C_n(…): The erasure-coding functions for n chunks. See equation 318.
E(…): The octet-sequence encode function. Superscripted −1 to denote the inverse. See appendix C.
F(…): The Fisher-Yates shuffle function. See equation 305.
H(…): The Blake 2b 256-bit hash function. See section 3.8.
H_K(…): The Keccak 256-bit hash function. See section 3.8.
K(…): The domain, or set of keys, of a dictionary. See section 3.5.
M(…): The constant-depth binary Merklization function. See appendix E.
M_B(…): The well-balanced binary Merklization function. See appendix E.
M_σ(…): The state Merklization function. See appendix D.
N(…): The erasure-coding chunks function. See appendix H.
O(…): The Bandersnatch ring root function. See section 3.8 and appendix G.
P_n(…): The octet-array zero-padding function. See equation 186.
Q(…): The numeric-sequence-from-hash function. See equation 307.
R: The group of erasure-coding piece-recovery functions.
S_k(…): The general signature function. See section 3.8.
T: The current time expressed in seconds after the start of the Jam Common Era. See section 4.4.
U(…): The substitute-if-nothing function. See equation 2.
V(…): The range, or set of values, of a dictionary or sequence. See section 3.5.
X_n(…): The signed-extension function for a value in N_{2^8n}. See equation 224.
Y(…): The alias/output/entropy function of a Bandersnatch vrf signature/proof. See section 3.8 and appendix G.
Z_n(…): The into-signed function for a value in N_{2^8n}. Superscripted with −1 to denote the inverse. See equation 220.
…: Power set function.
I.4. Values.
I.4.1. Block-context Terms. These terms are all contextualized to a single block. They may be superscripted with some
other term to alter the context and reference some other block.
A: The ancestor set of the block. See equation 39.
B: The block. See equation 13.
C: The service accumulation-commitment, used to form the Beefy root. See equation 163.
E: The block extrinsic. See equation 14.
F_v: The Beefy signed commitment of validator v. See equation 208.
G: The mapping from cores to guarantor keys. See section 11.3.
G: The mapping from cores to guarantor keys for the previous rotation. See section 11.3.
H: The block header. See equation 37.
Q: The selection of ready work-reports which a validator determined they must audit. See equation 189.
R: The set of Ed25519 guarantor keys who made a work-report. See equation 139.
S: The set of indices of services which have been accumulated (“progressed”) in the block. See equation 157.
T: The ticketed condition, true if the block was sealed with a ticket signature rather than a fallback. See equations 59 and 60.
U: The audit condition, equal to ⊤ once the block is audited. See section 17.
W: The set of work-reports which have now become available and ready for accumulation. See equation 129.
Without any superscript, the block is assumed to be the block being imported or, if no block is being imported, the head of the best chain (see section 19). Explicit block-contextualizing superscripts include:
B: The latest finalized block. See equation 19.
B: The block at the head of the best chain. See equation 19.
I.4.2. State components. Here, the prime annotation indicates posterior state. Individual components may be identified with a letter subscript.
α: The core αuthorizations pool. See equation 84.
β: Information on the most recent βlocks.
γ: State concerning Safrole. See equation 47.
γ_a: The sealing lottery ticket accumulator.
γ_k: The keys for the validators of the next epoch, equivalent to those keys which constitute γ_z.
γ_s: The sealing-key sequence of the current epoch.
γ_z: The Bandersnatch root for the current epoch’s ticket submissions.
δ: The (prior) state of the service accounts.
δ†: The post-preimage-integration, pre-accumulation intermediate state.
δ‡: The post-accumulation, pre-transfer intermediate state.
η: The eηtropy accumulator and epochal raηdomness.
ι: The validator keys and metadata to be drawn from next.
κ: The validator κeys and metadata currently active.
λ: The validator keys and metadata which were active in the prior epoch.
ρ: The ρending reports, per core, which are being made available prior to accumulation.
ρ†: The post-judgement, pre-guarantees-extrinsic intermediate state.
ρ‡: The post-guarantees-extrinsic, pre-assurances-extrinsic intermediate state.
σ: The σverall state of the system. See equations 12, 15.
τ: The most recent block’s τimeslot.
φ: The authorization queue.
ψ: Past judgements on work-reports and validators.
ψ_b: Work-reports judged to be incorrect.
ψ_g: Work-reports judged to be correct.
ψ_w: Work-reports whose validity is judged to be unknowable.
ψ_o: Validators who made a judgement found to be incorrect.
χ: The privileged service indices.
χ_m: The index of the empower service.
χ_v: The index of the designate service.
χ_a: The index of the assign service.
π: The activity statistics for the validators.
I.4.3. Virtual Machine components.
ε: The exit-reason resulting from all machine state transitions.
ν: The immediate values of an instruction.
µ: The memory sequence; a member of the set M.
ξ: The gas counter.
ω: The registers.
ζ: The instruction sequence.
ϖ: The sequence of basic blocks of the program.
ı: The instruction counter.
I.4.4. Constants.
A = 8: The period, in seconds, between audit tranches.
B_I = 10: The additional minimum balance required per item of elective service state.
B_L = 1: The additional minimum balance required per octet of elective service state.
B_S = 100: The basic minimum balance which all services require.
C = 341: The total number of cores.
D = 28,800: The period in timeslots after which an unreferenced preimage may be expunged.
E = 600: The length of an epoch in timeslots.
F = 2: The audit bias factor, the expected number of additional validators who will audit a work-report in the following tranche for each no-show in the previous.
G_A: The total gas allocated to a core for Accumulation.
G_I: The gas allocated to invoke a work-package’s Is-Authorized logic.
G_R: The total gas allocated for a work-package’s Refine logic.
H = 8: The size of recent history, in blocks.
I = 4: The maximum number of work items in a package.
K = 16: The maximum number of tickets which may be submitted in a single extrinsic.
L = 14,400: The maximum age in timeslots of the lookup anchor.
M = 128: The size of a transfer memo in octets.
N = 2: The number of ticket entries per validator.
O = 8: The maximum number of items in the authorizations pool.
P = 6: The slot period, in seconds.
Q = 80: The maximum number of items in the authorizations queue.
R = 10: The rotation period of validator-core assignments, in timeslots.
S = 4,000,000: The maximum size of service code in octets.
U = 5: The period in timeslots after which reported but unavailable work may be replaced.
V = 1023: The total number of validators.
W_C = 684: The basic size of our erasure-coded pieces. See equation 318.
W_M = 2^11: The maximum number of entries in a work-package manifest.
W_P = 12·2^20: The maximum size of an encoded work-package together with its extrinsic data and import implications, in octets.
W_R = 96·2^10: The maximum size of an encoded work-report in octets.
W_S = 6: The size of an exported segment in erasure-coded pieces.
X: Context strings, see below.
Y = 500: The number of slots into an epoch at which ticket-submission ends.
Z_A = 2: The pvm dynamic address alignment factor. See equation 226.
Z_I = 2^24: The standard pvm program initialization input data size. See equation A.7.
Z_P = 2^14: The standard pvm program initialization page size. See section A.7.
Z_Q = 2^16: The standard pvm program initialization segment size. See section A.7.
I.4.5. Signing Contexts.
X_A = $jam_available: Ed25519 Availability assurances.
X_B = $jam_beefy: bls Accumulate-result-root-mmr commitment.
X_E = $jam_entropy: On-chain entropy generation.
X_F = $jam_fallback_seal: Bandersnatch Fallback block seal.
X_G = $jam_guarantee: Ed25519 Guarantee statements.
X_I = $jam_announce: Ed25519 Audit announcement statements.
X_T = $jam_ticket_seal: Bandersnatch Ringvrf Ticket generation and regular block seal.
X_U = $jam_audit: Bandersnatch Audit selection entropy.
X_⊤ = $jam_valid: Ed25519 Judgements for valid work-reports.
X_⊥ = $jam_invalid: Ed25519 Judgements for invalid work-reports.
References
Bertoni, Guido et al. (2013). “Keccak”. In: Annual international conference on the theory and applications of cryptographic
techniques. Springer, pp. 313–314.
Bögli, Roman (2024). “Assessing risc Zero using ZKit: An Extensible Testing and Benchmarking Suite for ZKP Frame-
works”. PhD thesis. OST Ostschweizer Fachhochschule.
Boneh, Dan, Ben Lynn, and Hovav Shacham (2004). “Short Signatures from the Weil Pairing”. In: J. Cryptology 17,
pp. 297–319. doi: 10.1007/s00145-004-0314-9.
Burdges, Jeff, Alfonso Cevallos, et al. (2024). Efficient Execution Auditing for Blockchains under Byzantine Assumptions.
Cryptology ePrint Archive, Paper 2024/961. https://eprint.iacr.org/2024/961. url: https://eprint.iacr.org/
2024/961.
Burdges, Jeff, Oana Ciobotaru, et al. (2022). Efficient Aggregatable BLS Signatures with Chaum-Pedersen Proofs. Cryp-
tology ePrint Archive, Paper 2022/1611. https://eprint.iacr.org/2022/1611. url: https://eprint.iacr.org/
2022/1611.
Burdges, Jeffrey et al. (2023). Ring Verifiable Random Functions and Zero-Knowledge Continuations. Cryptology ePrint
Archive, Paper 2023/002. url: https://eprint.iacr.org/2023/002.
Buterin, Vitalik (2013). Ethereum: A Next-Generation Smart Contract and Decentralized Application Platform. url:
https://github.com/ethereum/wiki/wiki/White-Paper.
Buterin, Vitalik and Virgil Griffith (2019). Casper the Friendly Finality Gadget. arXiv: 1710.09437 [cs.CR].
Cosmos Project (2023). Interchain Security Begins a New Era for Cosmos. Fetched 18th March, 2024. url: https:
//blog.cosmos.network/interchain-security-begins-a-new-era-for-cosmos-a2dc3c0be63.
Dune and hildobby (2024). Ethereum Staking. Fetched 18th March, 2024. url: https://dune.com/hildobby/eth2-
staking.
Ethereum Foundation (2024a). A digital future on a global scale. Fetched 4th April, 2024. url: https://ethereum.org/en/roadmap/vision/.
Ethereum Foundation (2024b). Danksharding. Fetched 18th March, 2024. url: https://ethereum.org/en/roadmap/danksharding/.
Fisher, Ronald Aylmer and Frank Yates (1938). Statistical tables for biological, agricultural and medical research. Oliver
and Boyd.
Gabizon, Ariel, Zachary J. Williamson, and Oana Ciobotaru (2019). PLONK: Permutations over Lagrange-bases for
Oecumenical Noninteractive arguments of Knowledge. Cryptology ePrint Archive, Paper 2019/953. url: https://
eprint.iacr.org/2019/953.
Goldberg, Sharon et al. (Aug. 2023). Verifiable Random Functions (VRFs). RFC 9381. doi: 10.17487/RFC9381. url:
https://www.rfc-editor.org/info/rfc9381.
Hertig, Alyssa (2016). So, Ethereum’s Blockchain is Still Under Attack... Fetched 18th March, 2024. url: https://www.coindesk.com/markets/2016/10/06/so-ethereums-blockchain-is-still-under-attack/.
Hopwood, Daira et al. (2020). BLS12-381. url: https://z.cash/technology/jubjub/.
Hosseini, Seyed and Davide Galassi (2024). “Bandersnatch VRF-AD Specification”. Fetched 4th April, 2024. url: https://github.com/davxy/bandersnatch-vrfs-spec/blob/main/specification.pdf.
Jha, Prashant (2024). Solana outage raises questions about client diversity and beta status. Fetched 18th March, 2024.
url: https://cointelegraph.com/news/solana-outage-client-diversity-beta.
Josefsson, Simon and Ilari Liusvaara (Jan. 2017). Edwards-Curve Digital Signature Algorithm (EdDSA). RFC 8032. doi:
10.17487/RFC8032. url: https://www.rfc-editor.org/info/rfc8032.
Kokoris-Kogias, Eleftherios et al. (2017). OmniLedger: A Secure, Scale-Out, Decentralized Ledger via Sharding. Cryptology ePrint Archive, Paper 2017/406. url: https://eprint.iacr.org/2017/406.
Kwon, Jae and Ethan Buchman (2019). “Cosmos whitepaper”. In: A Netw. Distrib. Ledgers 27, pp. 1–32.
Lin, Sian-Jheng, Wei-Ho Chung, and Yunghsiang S. Han (2014). “Novel Polynomial Basis and Its Application to Reed-
Solomon Erasure Codes”. In: 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pp. 316–325.
doi: 10.1109/FOCS.2014.41.
Masson, Simon, Antonio Sanso, and Zhenfei Zhang (2021). Bandersnatch: a fast elliptic curve built over the BLS12-381
scalar field. Cryptology ePrint Archive, Paper 2021/1152. url: https://eprint.iacr.org/2021/1152.
Ng, Felix (2024). Is measuring blockchain transactions per second stupid in 2024? Fetched 18th March, 2024. url:
https://cointelegraph.com/magazine/blockchain-transactions-per-second-tps-stupid-big-questions/.
PolkaVM Project (2024). “PolkaVM/RISC0 Benchmark Results”. Fetched 3rd April, 2024. url: https://github.com/koute/risc0-benchmark/blob/master/README.md.
Saarinen, Markku-Juhani O. and Jean-Philippe Aumasson (Nov. 2015). The BLAKE2 Cryptographic Hash and Message
Authentication Code (MAC). RFC 7693. doi: 10.17487/RFC7693. url: https://www.rfc-editor.org/info/rfc7693.
Sadana, Apoorv (2024). Bringing Polkadot tech to Ethereum. Fetched 18th March, 2024. url: https://ethresear.ch/
t/bringing-polkadot-tech-to-ethereum/17104.
Sharma, Shivam (2023). Ethereum’s Rollups are Centralized. url: https://public.bnbstatic.com/static/files/
research/ethereums-rollups-are-centralized-a-look-into-decentralized-sequencers.pdf.
Solana Foundation (2023). Solana data goes live on Google Cloud BigQuery. Fetched 18th March, 2024. url: https:
//solana.com/news/solana-data-live-on-google-cloud-bigquery.
Solana Labs (2024). Solana Validator Requirements. Fetched 18th March, 2024. url: https://docs.solanalabs.com/
operations/requirements.
Stewart, Alistair and Eleftherios Kokoris-Kogias (2020). “GRANDPA: a Byzantine finality gadget”. In: arXiv preprint arXiv:2007.01560.
Tanana, Dmitry (2019). “Avalanche blockchain protocol for distributed computing security”. In: 2019 IEEE International
Black Sea Conference on Communications and Networking (BlackSeaCom). IEEE, pp. 1–3.
Thaler, Justin (2023). “A technical FAQ on Lasso, Jolt, and recent advancements in SNARK design”. Fetched 3rd April, 2024. url: https://a16zcrypto.com/posts/article/a-technical-faq-on-lasso-jolt-and-recent-advancements-in-snark-design/.
Wikipedia (2024). Fisher–Yates shuffle: The modern algorithm. url: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm.
Wood, Gavin (2014). “Ethereum: A secure decentralised generalised transaction ledger”. In: Ethereum project yellow
paper 151, pp. 1–32.
Yakovenko, Anatoly (2018). “Solana: A new architecture for a high performance blockchain v0.8.13”. Whitepaper.